Preface


This book aims to present a general survey of algebra, of its basic notions and 
main branches. Now what language should we choose for this? In reply to the 
question ‘What does mathematics study?’ it is hardly acceptable to answer 
‘structures’ or ‘sets with specified relations’; for among the myriad conceivable 
structures or sets with specified relations, only a very small discrete subset is of 
real interest to mathematicians, and the whole point of the question is to 
understand the special value of this infinitesimal fraction dotted among the 
amorphous masses. In the same way, the meaning of a mathematical notion is 
by no means confined to its formal definition; in fact, it may be rather better 
expressed by a (generally fairly small) sample of the basic examples, which serve 
the mathematician as the motivation and the substantive definition, and at the 
same time as the real meaning of the notion. 

Perhaps the same kind of difficulty arises if we attempt to characterise in terms 
of general properties any phenomenon which has any degree of individuality. 
For example, it doesn’t make sense to give a definition of the Germans or the 
French; one can only describe their history or their way of life. In the same way, 
it’s not possible to give a definition of an individual human being; one can only 
either give his ‘passport data’, or attempt to describe his appearance and
character, and relate a number of typical events from his biography. This is the path
we attempt to follow in this book, applied to algebra. Thus the book accom- 
modates the axiomatic and logical development of the subject together with more 
descriptive material: a careful treatment of the key examples and of points of 
contact between algebra and other branches of mathematics and the natural 
sciences. The choice of material here is of course strongly influenced by the 
author’s personal opinions and tastes.

As readers, I have in mind students of mathematics in the first years of an
undergraduate course, or theoretical physicists or mathematicians from outside 
algebra wanting to get an impression of the spirit of algebra and its place in 
mathematics. Those parts of the book devoted to the systematic treatment of 
notions and results of algebra make very limited demands on the reader: we 
presuppose only that the reader knows calculus, analytic geometry and linear 
algebra in the form taught in many high schools and colleges. The extent of the 
prerequisites required in our treatment of examples is harder to state; an ac- 
quaintance with projective space, topological spaces, differentiable and complex 
analytic manifolds and the basic theory of functions of a complex variable is 
desirable, but the reader should bear in mind that difficulties arising in the 
treatment of some specific example are likely to be purely local in nature, and 
not to affect the understanding of the rest of the book. 

This book makes no pretence to teach algebra: it is merely an attempt to talk 
about it. I have attempted to compensate at least to some extent for this by giving 
a detailed bibliography; in the comments preceding this, the reader can find 
references to books from which he can study the questions raised in this book, 
and also some other areas of algebra which lack of space has not allowed us to 
treat. 

A preliminary version of this book has been read by F.A. Bogomolov, R.V. 
Gamkrelidze, S.P. Démushkin, A.I. Kostrikin, Yu.I. Manin, V.V. Nikulin, A.N. 
Parshin, M.K. Polyvanov, V.L. Popov, A.B. Roiter and A.N. Tyurin; I am 
grateful to them for their comments and suggestions which have been incor- 
porated in the book. 

I am extremely grateful to N.I. Shafarevich for her enormous help with the 
manuscript and for many valuable comments. 


Moscow, 1984 I.R. Shafarevich 


I have taken the opportunity in the English translation to correct a number 
of errors and inaccuracies which remained undetected in the original; I am very 
grateful to E.B. Vinberg, A.M. Volkhonskii and D. Zagier for pointing these out. 
I am especially grateful to the translator M. Reid for innumerable improvements 
of the text. 


Moscow, 1987 I.R. Shafarevich


§1. What is Algebra?


What is algebra? Is it a branch of mathematics, a method or a frame of mind? 
Such questions do not of course admit either short or unambiguous answers. 
One can attempt a description of the place occupied by algebra in mathematics 
by drawing attention to the process for which Hermann Weyl coined the un- 
pronounceable word ‘coordinatisation’ (see [H. Weyl 109 (1939), Chap. I, §4]). 
An individual might find his way about the world relying exclusively on his sense 
organs, sight, feeling, on his experience of manipulating objects in the world 
outside and on the intuition resulting from this. However, there is another 
possible approach: by means of measurements, subjective impressions can be 
transformed into objective marks, into numbers, which are then capable of being 
preserved indefinitely, of being communicated to other individuals who have not 
experienced the same impressions, and most importantly, which can be operated 
on to provide new information concerning the objects of the measurement. 

The oldest example is the idea of counting (coordinatisation) and calculation 
(operation), which allow us to draw conclusions on the number of objects without 
handling them all at once. Attempts to ‘measure’ or to ‘express as a number’ a 
variety of objects gave rise to fractions and negative numbers in addition to the 
whole numbers. The attempt to express the diagonal of a square of side 1 as a 
number led to a famous crisis of the mathematics of early antiquity and to the 
construction of irrational numbers. 

Measurement determines the points of a line by real numbers, and much more 
widely, expresses many physical quantities as numbers. To Galileo is due the 
most extreme statement in his time of the idea of coordinatisation: ‘Measure 
everything that is measurable, and make measurable everything that is not yet 
so’. The success of this idea, starting from the time of Galileo, was brilliant. The 
creation of analytic geometry allowed us to represent points of the plane by pairs 
of numbers, and points of space by triples, and by means of operations with 
numbers, led to the discovery of ever new geometric facts. However, the success 
of analytic geometry is mainly based on the fact that it reduces to numbers not 
only points, but also curves, surfaces and so on. For example, a curve in the plane 
is given by an equation F(x, y) = 0; in the case of a line, F is a linear polynomial, 
and is determined by its 3 coefficients: the coefficients of x and y and the constant 
term. In the case of a conic section we have a curve of degree 2, determined by 
its 6 coefficients. If F is a polynomial of degree n then it is easy to see that it has
(n + 1)(n + 2)/2 coefficients; the corresponding curve is determined by these
coefficients in the same way that a point is given by its coordinates. 
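
The count of coefficients can be checked mechanically. The following Python sketch (the function names are ours, introduced only for illustration) lists the monomials x^i y^j with i + j ≤ n, one for each coefficient of F, and compares their number with (n + 1)(n + 2)/2.

```python
def monomial_exponents(n):
    """Exponent pairs (i, j) with i + j <= n: one coefficient of F for each pair."""
    return [(i, j) for i in range(n + 1) for j in range(n + 1 - i)]


def coefficient_count(n):
    """Number of coefficients of a general polynomial F(x, y) of degree n."""
    return len(monomial_exponents(n))


# Lines (n = 1), conics (n = 2) and cubics (n = 3).
for n in (1, 2, 3):
    print(n, coefficient_count(n), (n + 1) * (n + 2) // 2)
```

For n = 1 and n = 2 this recovers the 3 coefficients of a line and the 6 coefficients of a conic mentioned above.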

In order to express as numbers the roots of an equation, the complex numbers 
were introduced, and this takes a step into a completely new branch of mathe- 
matics, which includes elliptic functions and Riemann surfaces. 

For a long time it might have seemed that the path indicated by Galileo 
consisted of measuring ‘everything’ in terms of a known and undisputed collection
of numbers, and that the problem consists just of creating more and more
subtle methods of measurements, such as Cartesian coordinates or new physical 
instruments. Admittedly, from time to time the numbers considered as known 
(or simply called numbers) turned out to be inadequate: this led to a ‘crisis’, which 
had to be resolved by extending the notion of number, creating a new form of 
numbers, which themselves soon came to be considered as the unique possibility. 
In any case, as a rule, at any given moment the notion of number was considered 
to be completely clear, and the development moved only in the direction of 
extending it:

‘1, 2, many’ ⇒ natural numbers ⇒ integers

⇒ rationals ⇒ reals ⇒ complex numbers.

But matrices, for example, form a completely independent world of ‘number-like
objects’, which cannot be included in this chain. Simultaneously with them,
quaternions were discovered, and then other ‘hypercomplex systems’ (now called 
algebras). Infinitesimal transformations led to differential operators, for which 
the natural operation turns out to be something completely new, the Poisson 
bracket. Finite fields turned up in algebra, and p-adic numbers in number theory. 
Gradually, it became clear that the attempt to find a unified all-embracing 
concept of number is absolutely hopeless. In this situation the principle declared 
by Galileo could be accused of intolerance; for the requirement to ‘make mea- 
surable everything which is not yet so’ clearly discriminates against anything 
which stubbornly refuses to be measurable, excluding it from the sphere of 
interest of science, and possibly even of reason (and thus becomes a secondary 
quality or secunda causa in the terminology of Galileo). Even if, more modestly, 
the polemic term ‘everything’ is restricted to objects of physics and mathematics, 
more and more of these turned up which could not be ‘measured’ in terms of 
‘ordinary numbers’. 

The principle of coordinatisation can nevertheless be preserved, provided we 
admit that the set of ‘number-like objects’ by means of which coordinatisation 
is achieved can be just as diverse as the world of physical and mathematical 
objects they coordinatise. The objects which serve as ‘coordinates’ should satisfy 
only certain conditions of a very general character. 

They must be individually distinguishable. For example, whereas all points of 
a line have identical properties (the line is homogeneous), and a point can only 
be fixed by putting a finger on it, numbers are all individual: 3, 7/2, √2, π and so
on. (The same principle is applied when newborn puppies, indistinguishable to 
the owner, have different coloured ribbons tied round their necks to distinguish 
them.) 

They should be sufficiently abstract to reflect properties common to a wide 
circle of phenomena.

Certain fundamental aspects of the situations under study should be reflected 
in operations that can be carried out on the objects being coordinatised: addition, 
multiplication, comparison of magnitudes, differentiation, forming Poisson 
brackets and so on.

We can now formulate the point we are making in more detail, as follows:


Thesis. Anything which is the object of mathematical study (curves and surfaces, 
maps, symmetries, crystals, quantum mechanical quantities and so on) can be 
‘coordinatised’ or ‘measured’. However, for such a coordinatisation the ‘ordinary’ 
numbers are by no means adequate. 

Conversely, when we meet a new type of object, we are forced to construct (or 
to discover) new types of ‘quantities’ to coordinatise them. The construction and 
the study of the quantities arising in this way is what characterises the place of 
algebra in mathematics (of course, very approximately). 


From this point of view, the development of any branch of algebra consists of 
two stages. The first of these is the birth of the new type of algebraic objects out 
of some problem of coordinatisation. The second is their subsequent career, that 
is, the systematic development of the theory of this class of objects; this is 
sometimes closely related, and sometimes almost completely unrelated to the 
area in connection with which the objects arose. In what follows we will try not 
to lose sight of these two stages. But since algebra courses are often exclusively 
concerned with the second stage, we will maintain the balance by paying a little 
more attention to the first. 

We conclude this section with two examples of coordinatisation which are 
somewhat less standard than those considered up to now. 


Example 1. The Dictionary of Quantum Mechanics. In quantum mechanics, 
the basic physical notions are ‘coordinatised’ by mathematical objects, as follows. 


Physical notion                           Mathematical notion

State of a physical system                Line φ in an ∞-dimensional
                                          complex Hilbert space

Scalar physical quantity                  Self-adjoint operator

Simultaneously measurable                 Commuting operators
quantities

Quantity taking a precise                 Operator having φ as eigenvector
value λ in a state φ                      with eigenvalue λ

Set of values of quantities               Spectrum of an operator
obtainable by measurement

Probability of transition                 |(φ, ψ)|, where |φ| = |ψ| = 1
from state φ to state ψ


Example 2. Finite Models for Systems of Incidence and Parallelism Axioms. 
We start with a small digression. In the axiomatic construction of geometry, we 
often consider not the whole set of axioms, but just some part of them; to be
concrete we only discuss plane geometry here. The question then arises as to
what realisations of the chosen set of axioms are possible: do there exist other
systems of objects, apart from ‘ordinary’ plane geometry, for which the set of
axioms is satisfied? We consider now a very natural set of axioms of ‘incidence
and parallelism’.


Fig. 1          Fig. 2

(a) Through any two distinct points there is one and only one line. 

(b) Given any line and a point not on it, there exists one and only one other 
line through the point and not intersecting the line (that is, parallel to it). 

(c) There exist three points not on any line. 

It turns out that this set of axioms admits many realisations, including some 
which, in stark contrast to our intuition, have only a finite number of points and 
lines. Two such realisations are depicted in Figures 1 and 2. The model of Figure 
1 has 4 points A, B, C, D and 6 lines AB, CD; AD, BC; AC, BD. That of Figure 
2 has 9 points, A, B, C, D, E, F, G, H, I and 12 lines ABC, DEF, GHI; ADG, 
BEH, CFI; AEI, BFG, CDH; CEG, BDI, AFH. The reader can easily verify that 
axioms (a), (b), (c) are satisfied; in our list of lines, the families of parallel lines are 
separated by semicolons. 
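
The verification left to the reader can also be done by brute force. The following Python sketch is only an illustration (the function check_axioms and the names FIG1, FIG2 are ours): it encodes each line as the set of its points and tests axioms (a), (b), (c) directly for the two models.

```python
from itertools import combinations


def check_axioms(points, lines):
    """Return three booleans: whether axioms (a), (b), (c) hold in the model."""
    lines = [frozenset(line) for line in lines]
    # (a) any two distinct points lie on exactly one line
    a = all(sum(1 for l in lines if p in l and q in l) == 1
            for p, q in combinations(points, 2))
    # (b) for a line l and a point p not on it, exactly one line
    #     passes through p and has no point in common with l
    b = all(sum(1 for m in lines if p in m and not (m & l)) == 1
            for l in lines for p in points if p not in l)
    # (c) there exist three points not lying on any one line
    c = any(not any(set(triple) <= l for l in lines)
            for triple in combinations(points, 3))
    return a, b, c


FIG1 = ["AB", "CD", "AD", "BC", "AC", "BD"]
FIG2 = ["ABC", "DEF", "GHI", "ADG", "BEH", "CFI",
        "AEI", "BFG", "CDH", "CEG", "BDI", "AFH"]

print(check_axioms("ABCD", FIG1))
print(check_axioms("ABCDEFGHI", FIG2))
```

Deleting any one line from either list makes axiom (a) fail, which gives a quick check that the test is not vacuous.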

We return to our main theme, and attempt to ‘coordinatise’ the models of
axioms (a), (b), (c) just constructed. For the first of these we use the following
construction: write 0 and 1 for the property of an integer being even or odd 
respectively; then define operations on the symbols 0 and 1 by analogy with the 
way in which the corresponding properties of integers behave under addition 
and multiplication. For example, since the sum of an even and an odd integer is 
odd, we write 0 + 1 = 1, and so on. The result can be expressed in the ‘addition 
and multiplication tables’ of Figures 3 and 4. 
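
The two tables can be recomputed from the rule just described. In the following Python sketch (an illustration only; parity_table, ADD and MUL are our names) the symbol 0 stands for ‘even’ and 1 for ‘odd’, and the assertion inside the loop confirms that the parity of a sum or product depends only on the parities of the two integers involved.

```python
import operator


def parity_table(op, representatives=range(-6, 7)):
    """Table of op on the symbols 0 (even) and 1 (odd), read off from integers."""
    table = {}
    for m in representatives:
        for n in representatives:
            key = (m % 2, n % 2)
            value = op(m, n) % 2
            # the result may depend only on the parities of m and n
            assert table.setdefault(key, value) == value
    return table


ADD = parity_table(operator.add)
MUL = parity_table(operator.mul)

for name, table in (("+", ADD), ("*", MUL)):
    print(name, [[table[(r, c)] for c in (0, 1)] for r in (0, 1)])
```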

The pair of quantities 0 and 1 with the operations defined on them as above
serve us in coordinatising the ‘geometry’ of Figure 1. For this, we give points 
coordinates (X, Y) as follows:


A = (0, 0), B = (0, 1), C = (1, 0), D = (1, 1).


It is easy to check that the lines of the geometry are then defined by the linear 
equations: 


AB: 1X = 0;  CD: 1X = 1;  AD: 1X + 1Y = 0;
BC: 1X + 1Y = 1;  AC: 1Y = 0;  BD: 1Y = 1.


In fact these are the only 6 nontrivial linear equations which can be formed using 
the two quantities 0 and 1. 
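
This claim is easy to test by enumeration. The Python sketch below is an illustration under our own naming (POINTS, solutions, LINES): it runs through every equation aX + bY = c with coefficients 0 or 1 and (a, b) ≠ (0, 0), computing with remainders modulo 2, and records which points satisfy it.

```python
from itertools import product

# Coordinates of the four points of Figure 1, as given in the text.
POINTS = {"A": (0, 0), "B": (0, 1), "C": (1, 0), "D": (1, 1)}


def solutions(a, b, c):
    """Names of the points (X, Y) with aX + bY = c, computed modulo 2."""
    return "".join(name for name, (x, y) in POINTS.items()
                   if (a * x + b * y) % 2 == c)


# One entry per nontrivial linear equation: point set -> coefficients (a, b, c).
LINES = {solutions(a, b, c): (a, b, c)
         for a, b, c in product((0, 1), repeat=3) if (a, b) != (0, 0)}

for line, coefficients in sorted(LINES.items()):
    print(line, coefficients)
```

The six equations give six different pairs of points, and these pairs are the lines AB, CD, AD, BC, AC, BD of the model.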

The construction for the geometry of Figure 2 is similar, but slightly more
complicated: suppose that we divide up all integers into 3 sets U, V and W as 
follows: 


U = integers divisible by 3, 
V = integers with remainder 1 on dividing by 3, 


W = integers with remainder 2 on dividing by 3. 


The operations on the symbols U, V, W are defined as in the first example; for
example, a number in V plus a number in W always gives a number in U, and
so we set V + W = U; similarly, the product of two numbers in W is always a
number in V, so we set W·W = V. The reader can easily write out the corresponding
addition and multiplication tables.

It is then easy to check that the geometry of Figure 2 is coordinatised by our 
quantities U, V, W as follows: the points are 


A = (U, U), B = (U, V), C = (U, W), D = (V, U), E = (V, V),
F = (V, W), G = (W, U), H = (W, V), I = (W, W);


and the lines are again given by all possible linear equations which can be written 
out using the three symbols U, V, W; for example, AFH is given by VX + VY =
U, and CDH by VX + WY = V.
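
Both the tables for U, V, W and the list of lines can be generated by machine. The following Python sketch is a hedged illustration (the names SYMBOLS, add, mul, line and LINES are ours): it computes with the remainders 0, 1, 2 that U, V, W stand for, and collects the point sets cut out by all equations aX + bY = c with (a, b) ≠ (U, U).

```python
from itertools import product

SYMBOLS = "UVW"  # U, V, W stand for the remainders 0, 1, 2 on division by 3


def add(s, t):
    """Sum of two symbols, read off from the remainders they represent."""
    return SYMBOLS[(SYMBOLS.index(s) + SYMBOLS.index(t)) % 3]


def mul(s, t):
    """Product of two symbols, read off from the remainders they represent."""
    return SYMBOLS[(SYMBOLS.index(s) * SYMBOLS.index(t)) % 3]


# A = (U, U), B = (U, V), ..., I = (W, W), in the order used in the text.
POINTS = dict(zip("ABCDEFGHI", product(SYMBOLS, repeat=2)))


def line(a, b, c):
    """Names of the points (X, Y) satisfying aX + bY = c."""
    return "".join(name for name, (x, y) in POINTS.items()
                   if add(mul(a, x), mul(b, y)) == c)


LINES = {line(a, b, c) for a, b, c in product(SYMBOLS, repeat=3)
         if (a, b) != ("U", "U")}

print(sorted(LINES))
```

There are 18 such equations but only 12 different lines, since multiplying an equation through by W does not change its set of solutions.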

Thus we have constructed finite number systems in order to coordinatise finite 
geometries. We will return to the discussion of these constructions later. 

Already these few examples give an initial impression of what kind of objects 
can be used in one or other version of ‘coordinatisation’. First of all, the collection 
of objects to be used must be rigorously delineated; in other words, we must
indicate a set (or perhaps several sets) of which these objects can be elements.
Secondly, we must be able to operate on the objects, that is, we must define 
operations, which from one or more elements of the set (or sets) allow us to 
construct new elements. For the moment, no further restrictions on the nature of 
the sets to be used are imposed; in the same way, an operation may be a com- 
pletely arbitrary rule taking a set of k elements into a new element. All the same, 
these operations will usually preserve some similarities with operations on 
numbers. In particular, in all the situations we will discuss, k = 1 or 2. The basic 
examples of operations, with which all subsequent constructions should be 
compared, will be: the operation a ↦ −a taking any number to its negative; the
operation b ↦ b⁻¹ taking any nonzero number b to its inverse (for each of these
k = 1); and the operations (a, b) ↦ a + b and (a, b) ↦ ab of addition and multiplication
(for each of these k = 2).


§2. Fields 


We start by describing one type of ‘sets with operations’ as described in § 1 
which corresponds most closely to our intuition of numbers. 

A field is a set K on which two operations are defined, each taking two elements
of K into a third; these operations are called addition and multiplication, and the 
result of applying them to elements a and b is denoted by a + b and ab. The 
operations are required to satisfy the following conditions: 


Addition: 

Commutativity: a + b = b + a; 

Associativity: a + (b + c) = (a + b) + c;

Existence of zero: there exists an element 0 ∈ K such that a + 0 = a for every
a (it can be shown that this element is unique);

Existence of negative: for any a there exists an element (−a) such that
a + (−a) = 0 (it can be shown that this element is unique).

Multiplication:

Commutativity: ab = ba;

Associativity: a(bc) = (ab)c;

Existence of unity: there exists an element 1 ∈ K such that a1 = a for every a
(it can be shown that this element is unique);

Existence of inverse: for any a ≠ 0 there exists an element a⁻¹ such that
aa⁻¹ = 1 (it can be shown that for given a, this element is unique).

Addition and multiplication:

Distributivity: a(b + c) = ab + ac. 

Finally, we assume that a field does not consist only of the element 0, or 
equivalently, that 0 ≠ 1.

These conditions, taken as a whole, are called the field axioms. The ordinary
identities of algebra, such as


(a + b)² = a² + 2ab + b²
or
a⁻¹ − (a + 1)⁻¹ = a⁻¹(a + 1)⁻¹


follow from the field axioms. We only have to bear in mind that for a natural
number n, the expression na means a + a + ··· + a (n times), rather than the
product of a with the number n (which may not be in K).
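
For a finite set with given operations, the field axioms can be tested by exhaustive search. The Python sketch below is only an illustration: the functions is_field and residues are our own, and the sets tested are the remainders 0, 1, ..., n − 1 with addition and multiplication taken modulo n, a choice made here purely as a convenient source of examples.

```python
from itertools import product


def is_field(elements, add, mul):
    """Brute-force test of the field axioms on a finite set with two operations."""
    E = list(elements)
    pairs = list(product(E, repeat=2))
    triples = list(product(E, repeat=3))
    if not all(add(a, b) in E and mul(a, b) in E for a, b in pairs):
        return False
    commutative = all(add(a, b) == add(b, a) and mul(a, b) == mul(b, a)
                      for a, b in pairs)
    associative = all(add(a, add(b, c)) == add(add(a, b), c)
                      and mul(a, mul(b, c)) == mul(mul(a, b), c)
                      for a, b, c in triples)
    distributive = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
                       for a, b, c in triples)
    zeros = [z for z in E if all(add(a, z) == a for a in E)]
    ones = [u for u in E if all(mul(a, u) == a for a in E)]
    if not zeros or not ones or zeros[0] == ones[0]:
        return False  # no zero, no unity, or 0 = 1
    zero, one = zeros[0], ones[0]
    negatives = all(any(add(a, b) == zero for b in E) for a in E)
    inverses = all(any(mul(a, b) == one for b in E) for a in E if a != zero)
    return commutative and associative and distributive and negatives and inverses


def residues(n):
    """The remainders 0, ..., n - 1 with addition and multiplication modulo n."""
    return range(n), (lambda a, b: (a + b) % n), (lambda a, b: (a * b) % n)


print([n for n in range(2, 8) if is_field(*residues(n))])
```

In this range the test succeeds for n = 2, 3, 5, 7 and fails for n = 4 and n = 6, where the element 2 has no inverse.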

Working over an arbitrary field K (that is, assuming that all coordinates, 
coefficients, and so on appearing in the argument belong to K) provides the most 
natural context for constructing those parts of linear algebra and analytic 
geometry not involving lengths, polynomial algebras, rational fractions, and
so on.

Basic examples of fields are the field of rational numbers, denoted by Q, the 
field of real numbers R and the field of complex numbers C. 

If the elements of a field K are contained among the elements of a field L and 
the operations in K and L agree, then we say that K is a subfield of L, and L an 
extension of K, and we write K ⊂ L. For example, Q ⊂ R ⊂ C.


Example 1. In §1, in connection with the ‘geometry’ of Figure 1, we defined
operations of addition and multiplication on the set {0, 1} of parity symbols.
It is easy to check that this is a field, in which the symbol 0 is the zero element
and the symbol 1 the unity. If we identify these symbols with the numbers 0 and 1,
we see that the multiplication table of Figure 4 is just the rule
for multiplying 0 and 1 in Q, and the addition table of Figure 3 differs in that
1 + 1 = 0. The field constructed in this way consisting of 0 and 1 is denoted by
F_2. Similarly, the elements U, V, W considered in connection with the geometry
of Figure 2 also form a field, in which U = 0, V = 1 and W = −1. We thus obtain
examples of fields with a finite number (2 or 3) of elements. Fields having only
finitely many elements (that is, finite fields) are very interesting objects with many
applications. A finite field can be specified by writing out the addition and
multiplication tables of its elements, as we did in Figures 3–4. In §1 we met such
fields in connection with the question of the realisation of a certain set of axioms
of geometry in a finite set of objects; but they arise just as naturally in algebra
as realising the field axioms in a finite set of objects. A field consisting of q
elements is denoted by F_q.


Example 2. An algebraic expression obtained from an unknown x and arbitrary
elements of a field K using the addition, multiplication and division operations,
can be written in the form


(a_0 + a_1 x + ··· + a_n x^n)/(b_0 + b_1 x + ··· + b_m x^m),    (1)


where a_i, b_i ∈ K and not all b_i = 0. An expression of this form is called a rational
fraction, or a rational function of x. We can now consider it as a function, taking
any x in K (or any x in L, for some field L containing K) into the given expression, 
provided only that the denominator is not zero. All rational functions form a 
field, called the rational function field; it is denoted by K(x). We will discuss 
certain difficulties connected with this definition in §3. The elements of K are 
contained among the rational functions as ‘constant’ functions, so that K(x) is
an extension of K. 

In a similar way we define the field K(x,y) of rational functions in two 
variables, or in any number of variables. 

An isomorphism of two fields K′ and K″ is a 1-to-1 correspondence a′ ↔ a″
between their elements such that a′ ↔ a″ and b′ ↔ b″ implies that a′ + b′ ↔
a″ + b″ and a′b′ ↔ a″b″; we say that two fields are isomorphic if there exists an
isomorphism between them. If L′ and L″ are isomorphic fields, both of which are
extensions of the same field K, and if the isomorphism between them takes each
element of K into itself, then we say that it is an isomorphism over K, and that
L′ and L″ are isomorphic over K. An isomorphism of fields K′ and K″ is denoted
by K′ ≅ K″. If L′ and L″ are finite fields, then to say that they are isomorphic
means that their addition and multiplication tables are the same; that is, they
differ only in the notation for the elements of L′ and L″. The notion of
isomorphism for arbitrary fields is similar in meaning. 

For example, suppose we take some line a and mark a point O and a ‘unit 
interval’ OE on it; then we can in a geometric way define addition and multiplica- 
tion on the directed intervals (or vectors) contained in a. Their construction is 
given in Figures 5-6. In Figure 5, b is an arbitrary line parallel to a and U an 
arbitrary point on it, OU || AV and VC || UB; then OC = OA + OB. In Figure 6,
b is an arbitrary line passing through O, and EU || BV and VC || UA; then OC =
OA·OB.


Fig. 5 Fig. 6 


With this definition of the operations, intervals of the line form a field P; to 
verify all the axioms is a sequence of nontrivial geometric problems. Taking each
interval into a real number, for example an infinite decimal fraction (this is again 
a process of measurement!), we obtain an isomorphism between P and the real 
number field R.


Example 3. We return now to the plane curve given by F(x, y) = 0, where F is
a polynomial; let C denote the curve itself. Taking C into the set of coefficients 
of F is one very primitive method of ‘coordinatising’ C. We now describe another 
method, which is much more precise and effective. 

It is not hard to show that any nonconstant polynomial F(x, y) can be factorised
as a product of a number of nonconstant polynomials, each of which
cannot be factorised any further. If F = F_1·F_2 ··· F_k is such a factorisation then
our curve with equation F = 0 is the union of k curves with equations F_1 = 0,
F_2 = 0, ..., F_k = 0 respectively. We say that a polynomial which does not factorise
as a product of nonconstant polynomials is irreducible. From now on we
will assume that F is irreducible.

Consider an arbitrary rational function φ(x, y) in two variables; φ is represented
as a ratio of two polynomials:


φ(x, y) = P(x, y)/Q(x, y),    (2)


and we suppose that the denominator Q is not divisible by F. Consider φ as a
function on points of C only; it is undefined on points (x, y) where both Q(x, y) = 0
and F(x, y) = 0. It can be proved that under the assumptions we have made there
are only finitely many such points. In order that our treatment from now on has
some content, we assume that the curve C has infinitely many points (that is, we
exclude curves such as x² + y² = −1, x² + y² = 0 and so on; if we also consider
points with complex coordinates, then the assumption is not necessary). Then
φ(x, y) defines a function on the set of points of C (for short, we say on C), possibly
undefined at a finite number of points, in the same way that the rational
function (1) is undefined at the finite number of values of x where the denominator
of (1) vanishes. Functions obtained in this way are called rational functions on
C. It can be proved that all the rational functions on a curve C form a field (for
example, one proves that a function φ defines a nonzero function on C only if
P(x, y) is not divisible by F(x, y), and then the function φ⁻¹ = Q(x, y)/P(x, y)
satisfies the condition required for φ, that the denominator is not divisible by F;
this proves the existence of the inverse). The field of rational functions on C is
denoted by R(C); it is an extension of the real number field R. Considering points
with coordinates in an arbitrary field K, it is easy to replace R by K in this
construction.

Assigning to a curve C the field K(C) is a much more precise method of
‘coordinatising’ C than the coefficients of its equation. First of all, passing from
a coordinate system (x, y) to another system (x′, y′), the equation of a curve
changes, but the field K(C) is replaced by an isomorphic field, as one sees easily.
Another important point is that an isomorphism of fields K(C) and K(C′)
establishes important relations between curves C and C′.

Suppose as a first example that C is the x-axis. Then since the equation of C
is y = 0, restricting a function φ to C we must set y = 0 in (2), and we get a
rational function of x:


φ(x, 0) = P(x, 0)/Q(x, 0).

Thus in this case, the field K(C) is isomorphic to the rational function field 
K(x). Obviously, the same thing holds if C is an arbitrary line. 

We proceed to the case of a curve C of degree 2. Let us prove that in this case 
also the field K(C) is isomorphic to the field of rational functions of one variable 
K(t). For this, choose an arbitrary point (x_0, y_0) on C and take t to be the slope
of the line joining it to a point (x, y) with variable coordinates (Figure 7).


Fig. 7


In other words, set t = (y − y_0)/(x − x_0), as a function on C. We now prove that x and
y, as functions on C, are rational functions of t. For this, recall that y − y_0 =
t(x − x_0), and if F(x, y) = 0 is the equation of C, then on C we have


F(x, y_0 + t(x − x_0)) = 0.    (3)


In other words, the relation (3) is satisfied in K(C). Since C is a curve of degree
2, this is a quadratic equation for x: a(t)x² + b(t)x + c(t) = 0 (whose coefficients
involve t). However, one root of this equation is known, namely x = x_0; this
simply reflects the fact that (x_0, y_0) is a point of C. The second root is then
obtained from the condition that the sum of the roots equals −b(t)/a(t). We get
an expression x = f(t) as a rational function of t, and a similar expression
y = g(t); of course, F(f(t), g(t)) = 0. Thus taking x ↔ f(t), y ↔ g(t) and
φ(x, y) ↔ φ(f(t), g(t)), we obtain an isomorphism of K(C) and K(t) over K.

The geometric meaning of the isomorphism obtained in this way is that points
of C can be parametrised by rational functions: x = f(t), y = g(t). If C has
the equation y² = ax² + bx + c then on C we have y = √(ax² + bx + c), and
another form of the result we have obtained is that both x and √(ax² + bx + c)
can be expressed as rational functions of some third function t. This expression
is useful, for example, in evaluating indefinite integrals: it shows that any
integral


∫ φ(x, √(ax² + bx + c)) dx,


where φ is a rational function, reduces by substitutions to integrals of a rational
function of t, and can hence be expressed in terms of elementary functions. In
analysis our substitutions are called Euler substitutions. We mention two further
applications.

(a) The field of trigonometric functions is defined as the field of all rational
functions of sin φ and cos φ. Since sin² φ + cos² φ = 1, this field is isomorphic
to R(C), where C is the circle with equation x² + y² = 1. We know that R(C) is
isomorphic to R(t). This explains why each trigonometric equation can be
reduced to an algebraic equation.


(b) In the case of the circle x² + y² = 1, if we set x_0 = 0, y_0 = −1, our
construction gives the formulas


x = 2t/(1 + t²),   y = (1 − t²)/(1 + t²).    (4)


A problem of number theory which goes back to antiquity is the question
of finding integers a, b, c for which a² + b² = c². Setting a/c = x, b/c = y, t = p/q
and reducing formula (4) to common denominators, we get the well-known
expression


a = 2pq,   b = q² − p²,   c = q² + p².
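
The passage from formula (4) to integer solutions can be followed numerically. The Python sketch below is an illustration only (circle_point and triple are our names): it evaluates the two rational functions of (4) in exact rational arithmetic and compares the result with the integers obtained by clearing denominators.

```python
from fractions import Fraction


def circle_point(t):
    """Point (x, y) given by the two rational functions of formula (4)."""
    t = Fraction(t)
    return 2 * t / (1 + t * t), (1 - t * t) / (1 + t * t)


def triple(p, q):
    """Integers (a, b, c) obtained from t = p/q by clearing denominators."""
    return 2 * p * q, q * q - p * p, q * q + p * p


for p, q in [(1, 2), (2, 3), (1, 4)]:
    x, y = circle_point(Fraction(p, q))
    a, b, c = triple(p, q)
    assert x * x + y * y == 1                       # the point lies on the circle
    assert (Fraction(a, c), Fraction(b, c)) == (x, y)
    assert a * a + b * b == c * c
    print((p, q), (a, b, c))
```

For example, t = 1/2 gives the point (4/5, 3/5) and the integers 4, 3, 5.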


Already for the curve C with equation y² = x³ + 1 the field K(C) is not isomorphic
to the field of rational functions. This is closely related to the fact that an
elliptic integral, for example ∫ dx/√(x³ + 1), cannot be expressed in terms of
elementary functions.

Of course, the field K(C) also plays an important role in the study of other 
curves. It can also be defined for surfaces, given by F(x, y, z) = 0, where F is a
polynomial, and if we consider spaces of higher dimensions, for an even wider
class of geometric objects, algebraic varieties, defined in an n-dimensional space
by an arbitrary system of equations F_1 = 0, ..., F_m = 0, where the F_i are polynomials
in n variables.

In conclusion, we give examples of fields which arise in analysis. 


Example 4. All meromorphic functions on some connected domain of the plane 
of one complex variable (or on an arbitrary connected complex manifold) form 
a field. 


Example 5. Consider the set of all Laurent series ∑_{n=−k}^{∞} a_n z^n which are convergent
in an annulus 0 < |z| < R (where different series may have different
annuli of convergence). With the usual definition of operations on series, these
form a field, the field of Laurent series. If we use the same rules to compute the 
coefficients, we can define the sum and product of two Laurent series, even if 
these are nowhere convergent. We thus obtain the field of formal Laurent series. 
One can go further, and consider this construction in the case that the coefficients 
a, belong to an arbitrary field K. The resulting field is called the field of formal 
Laurent series with coefficients in K, and is denoted by K((z)). 
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The simplest possible example of ‘coordinatisation’ is counting, and it leads 
(once 0 and negative numbers have been introduced) to the integers, which do 
not form a field. Operations of addition and multiplication are defined on the 
set of all integers (positive, zero or negative), and these satisfy all the field axioms 
but one, namely the existence of an inverse element $a^{-1}$ for every $a \neq 0$ (since,
for example, 1/2 is already not an integer).

A set having two operations, called addition and multiplication, satisfying all
the field axioms except possibly for the requirement of existence of an inverse
element $a^{-1}$ for every $a \neq 0$ is called a commutative ring; it is convenient not to
exclude the ring consisting just of the single element 0 from the definition.

The field axioms, with the axiom of the existence of an inverse and the
condition $0 \neq 1$ omitted, will from now on be referred to as the commutative ring
axioms.

By analogy with fields, we define the notions of a subring $A \subset B$ of a ring, and
isomorphism of two rings $A'$ and $A''$; in the case that $A \subset A'$ and $A \subset A''$ we also
have the notion of an isomorphism of $A'$ and $A''$ over A; an isomorphism of rings
is again written $A' \cong A''$.


Example 1. The Ring of Integers. This is denoted by Z; obviously $Z \subset Q$.


Example 2. An example which is just as fundamental is the polynomial ring
A[x] with coefficients in a ring A. In view of its fundamental role, we spend some 
time on the detailed definition of A[x]. First we characterise it by certain 
properties. 

We say that a commutative ring B is a polynomial ring over a commutative
ring A if $B \supset A$ and B contains an element x with the property that every element
of B can be uniquely written in the form

$$a_0 + a_1 x + \cdots + a_n x^n \quad \text{with } a_i \in A$$

for some $n \geq 0$. If $B'$ is another such ring, with $x'$ the corresponding element, the
correspondence

$$a_0 + a_1 x + \cdots + a_n x^n \mapsto a_0 + a_1 x' + \cdots + a_n (x')^n$$

defines an isomorphism of B and $B'$ over A, as one sees easily. Thus the polynomial
ring is uniquely defined, in a reasonable sense.

However, this does not solve the problem as to its existence. In most cases the
‘functional’ point of view is sufficient: we consider the functions f of A into itself
of the form

$$f(c) = a_0 + a_1 c + \cdots + a_n c^n \quad \text{for } c \in A. \tag{1}$$

Operations on functions are defined as usual: (f + g)(c) = f(c) + g(c) and
(fg)(c) = f(c)g(c). Taking an element $a \in A$ into the constant function f(c) = a,
we can view A as a subring of the ring of functions. If we let x denote the function
x(c) = c then the function (1) is of the form

$$f = a_0 + a_1 x + \cdots + a_n x^n. \tag{2}$$


However, in some cases (for example if the number of elements of A is finite, and
n is at least as large as this number), the expression (2) for f may not be unique. Thus
in the field $F_2$ of §2, Example 1, the functions x and $x^2$ are the same. For this
reason we give an alternative construction.
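The failure of uniqueness is easy to observe by tabulating values. The sketch below is illustrative only: the helper names are hypothetical, and the field with p elements is modelled as the integers modulo p (an assumption for p other than 2, which the text has not yet discussed at this point).

```python
def evaluate(coeffs, c, p):
    """Evaluate a_0 + a_1 c + ... + a_n c^n modulo p by Horner's rule."""
    value = 0
    for a in reversed(coeffs):
        value = (value * c + a) % p
    return value

def as_function(coeffs, p):
    """Table of values of the polynomial at 0, 1, ..., p - 1."""
    return tuple(evaluate(coeffs, c, p) for c in range(p))

x = [0, 1]             # coefficient sequence of x
x_squared = [0, 0, 1]  # coefficient sequence of x^2

# Different coefficient sequences, yet the same function on the two-element field.
print(as_function(x, 2), as_function(x_squared, 2))
# Modulo 3 the two functions are distinguishable.
print(as_function(x, 3), as_function(x_squared, 3))
```

The coefficient lists differ while the value tables over two elements coincide, which is exactly why the functional definition does not determine the coefficients.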

We could define polynomials as ‘expressions’ $a_0 + a_1 x + \cdots + a_n x^n$, with +
and $x^i$ thought of as conventional signs or place-markers, serving to denote the
sequence $(a_0, \dots, a_n)$ of elements of a field K. After this, sum and product are
given by formulas

$$\sum_k a_k x^k + \sum_k b_k x^k = \sum_k (a_k + b_k) x^k,$$

$$\Bigl(\sum_k a_k x^k\Bigr)\Bigl(\sum_l b_l x^l\Bigr) = \sum_m c_m x^m, \quad \text{where } c_m = \sum_{k+l=m} a_k b_l.$$


Rather more concretely, the same idea can be formulated as follows. We consider
the set of all infinite sequences $(a_0, a_1, \dots, a_n, \dots)$ of elements of a ring A, every
sequence consisting of zeros from some term onwards (this term may be different
for different sequences). First we define addition of sequences by

$$(a_0, a_1, \dots, a_n, \dots) + (b_0, b_1, \dots, b_n, \dots) = (a_0 + b_0, a_1 + b_1, \dots, a_n + b_n, \dots).$$

All the ring axioms concerning addition are satisfied. Now for multiplication we
define first just the multiplication of sequences by elements of A:

$$a(a_0, a_1, \dots, a_n, \dots) = (a a_0, a a_1, \dots, a a_n, \dots).$$


We write $E_k = (0, \dots, 1, 0, \dots)$ for the sequence consisting of 1 in the kth place
and 0 everywhere else. Then it is easy to see that

$$(a_0, a_1, \dots, a_n, \dots) = \sum_{k \geq 0} a_k E_k. \tag{3}$$

Here the right-hand side is a finite sum in view of the condition imposed on
sequences. Now define multiplication by

$$\Bigl(\sum_k a_k E_k\Bigr)\Bigl(\sum_l b_l E_l\Bigr) = \sum_{k,l} a_k b_l E_{k+l} \tag{4}$$

(on the right-hand side we must gather together all the terms for k and l with
k + l = n as the coefficient of $E_n$). It follows from (4) that $E_0$ is the unit element
of the ring, and $E_k = E_1^k$. Setting $E_1 = x$ we can write the sequence (3) in the form
$\sum a_k x^k$. Obviously this expression for the sequence is unique. It is easy to check
that the multiplication (4) satisfies the axioms of a commutative ring, so that the
ring we have constructed is the polynomial ring A[x].
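The construction by sequences translates almost word for word into a program. The following sketch takes A to be the ring of integers (an assumption made only so that Python integers can serve as coefficients) and stores a sequence as the finite list of its terms up to the last nonzero one; the names `trim`, `add`, `mul` and `E` are hypothetical.

```python
def trim(seq):
    """Drop trailing zeros, so that every sequence has exactly one list representation."""
    seq = list(seq)
    while seq and seq[-1] == 0:
        seq.pop()
    return seq

def add(a, b):
    """Termwise addition of two finitely supported sequences."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim(s + t for s, t in zip(a, b))

def mul(a, b):
    """Multiplication by formula (4): a_k b_l contributes to the coefficient of E_{k+l}."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for k, a_k in enumerate(a):
        for l, b_l in enumerate(b):
            c[k + l] += a_k * b_l
    return trim(c)

def E(k):
    """The sequence with 1 in the kth place (counting from 0) and 0 elsewhere."""
    return [0] * k + [1]

x = E(1)
print(mul(E(0), [3, 0, 2]))    # E_0 acts as the unit element
print(mul(mul(x, x), x))       # E_1 cubed equals E_3
print(mul([1, 1], [1, -1]))    # (1 + x)(1 - x) = 1 - x^2
```

Storing only finitely many terms is what the condition "zeros from some term onwards" makes possible; dropping that condition would require a lazy representation instead of lists.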

The polynomial ring A[x, y] is defined as A[x][y], or by generalising the
above construction. In a similar way one defines the polynomial ring $A[x_1, \dots, x_n]$
in any number of variables.


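Example 3. All linear differential operators with constant (real) coefficients can
be written as polynomials in the operators $\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n}$. Hence they form a ring

$$R\left[\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n}\right].$$

Sending $\frac{\partial}{\partial x_i}$ to $t_i$ defines an isomorphism

$$R\left[\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n}\right] \cong R[t_1, \dots, t_n].$$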


If A = K is a field then the polynomial ring K[x] is a subring of the rational
function field K(x), in the same way that the ring of integers Z is a subring of the
rational field Q. A ring which is a subring of a field has an important property:
the relation ab = 0 is only possible in it if either a = 0 or b = 0; indeed, it follows
easily from the commutative ring axioms that $a \cdot 0 = 0$ for any a. Hence if ab = 0
in a field and $a \neq 0$, multiplying on the left by $a^{-1}$ gives b = 0. Obviously the
same thing holds for a ring contained in a field.

A commutative ring with the properties that for any of its elements a, b the
product ab = 0 only if a = 0 or b = 0, and that $0 \neq 1$, is called an integral ring
or an integral domain. Thus a subring of any field is an integral domain.


Theorem I. For any integral domain A, there exists a field K containing A as a
subring, and such that every element of K can be written in the form $ab^{-1}$ with
$a, b \in A$ and $b \neq 0$. A field K with this property is called the field of fractions of
A; it is uniquely defined up to isomorphism.


For example, the field of fractions of Z is Q, that of the polynomial ring K[x]
is the field of rational functions K(x), and that of $K[x_1, \dots, x_n]$ is $K(x_1, \dots, x_n)$.
Quite generally, fields of fractions give an effective method of constructing new
fields.


Example 4. If A and B are two rings, their direct sum is the ring consisting of
pairs (a, b) with $a \in A$ and $b \in B$, with addition and multiplication given by

$$(a_1, b_1) + (a_2, b_2) = (a_1 + a_2, b_1 + b_2),$$
$$(a_1, b_1)(a_2, b_2) = (a_1 a_2, b_1 b_2).$$

Direct sum is denoted by $A \oplus B$. The direct sum of any number of rings is defined
in a similar way.

A direct sum is not an integral domain: (a, 0)(0, b) = (0, 0), which is the zero
element of $A \oplus B$.

The most important example of commutative rings, which includes non-integral
rings, is given by rings of functions. Properly speaking, the direct sum
$A \oplus \cdots \oplus A$ of n copies of A can be viewed as the ring of functions on a set of
n elements (such as {1, 2, ..., n}) with values in A: the element
$(a_1, \dots, a_n) \in A \oplus \cdots \oplus A$ can be identified with the function f given by $f(i) = a_i$.
Addition and multiplication of functions are given as usual by operating on their values.


Example 5. The set of all continuous functions (to be definite, real-valued
functions) on the interval [0, 1] forms a commutative ring $\mathcal{C}$ under the usual
definition of addition and multiplication of functions. This is not an integral
domain: if f and g are the functions depicted in Figures 8 and 9, then obviously
fg = 0. In the definition, we could replace real-valued functions by complex-valued
ones, and the interval by an arbitrary topological space. Rings of this form
occurring in analysis are usually considered together with a topology on their set
of elements, or a norm defining a topology. For example, in our case it is standard
to consider the norm

$$\|f\| = \sup_{0 \leq x \leq 1} |f(x)|.$$

Examples analogous to those of Figures 8 and 9 can also be constructed in
the ring of $C^\infty$ functions on the interval.


[Fig. 8 and Fig. 9: graphs of the functions f and g over the x-axis, with the point 1/2 marked.]


Example 6. The ring of functions of one complex variable holomorphic at the
origin is an integral domain, and its field of fractions is the field of Laurent series
(§ 2, Example 5). Similarly to § 2, Example 5 we can define the ring of formal power
series $\sum_{n=0}^{\infty} a_n t^n$ with coefficients $a_n$ in any field K. This can also be constructed
as in Example 2, if we just omit the condition that the sequences $(a_0, a_1, \dots, a_n, \dots)$
are 0 from some point onwards. This is also an integral domain, and its field of
fractions is the field of formal Laurent series K((t)). The ring of formal power
series is denoted by K[[t]].


Example 7. The ring $\mathcal{O}_n$ of functions in n complex variables holomorphic at the
origin, that is of functions that can be represented as power series

$$\sum a_{i_1 \dots i_n} z_1^{i_1} \cdots z_n^{i_n},$$

convergent in some neighbourhood of the origin. By analogy with Example 6
we can define the rings of formal power series $C[[z_1, \dots, z_n]]$ with complex
coefficients, and $K[[z_1, \dots, z_n]]$ with coefficients in any field K.


Example 8. We return to the curve C defined in the plane by the equation 
F(x, y) = 0, where F is a polynomial with coefficients in a field K, as considered 
in § 2. With each polynomial P(x, y) we associate the function on the set of points 
of C defined by restricting P to C. Functions of this form are polynomial functions 
on C. Obviously they form a commutative ring, which we denote by K[C]. If F 
is a product of factors then the ring K[C] may not be an integral domain. For 
example if F = xy then C is the union of the coordinate axes; then x is zero on 
the y-axis, and y on the x-axis, so that their product is zero on the whole curve 
C. However, if F is an irreducible polynomial then K[C] is an integral domain. 
In this case the field of fractions of K[C] is the rational function field K(C) of C; 
the ring K[C] is called the coordinate ring of C.

Taking an algebraic curve C into the ring K[C] is also an example of 
‘coordinatisation’, and in fact is more precise than taking C to K(C), since K[C] 
determines K(C) (as its field of fractions), whereas there exist curves C and C’ for 
which the fields K(C) and K(C’) are isomorphic, but the rings K[C] and K[C’] 
are not. 

Needless to say, we could replace the algebraic curve given by F(x, y) = 0 by 
an algebraic surface given by F(x, y,z) = 0, and quite generally by an algebraic 
variety. 


Example 9. Consider an arbitrary set M, and the commutative ring A consisting
of all functions on M with values in the finite field with two elements $F_2$
(§ 2, Example 1). Thus A consists of all maps from M to $F_2$. Since $F_2$ has only two
elements 0 and 1, a function with values in $F_2$ is uniquely determined by the subset
$U \subset M$ of elements on which it takes the value 1 (on the remainder it takes the
value 0). Conversely, any subset $U \subset M$ determines a function $\varphi_U$ with $\varphi_U(m) = 1$
if $m \in U$ and $\varphi_U(m) = 0$ if $m \notin U$. It is easy to see which operations on subsets
correspond to the addition and multiplication of functions:

$$\varphi_U \cdot \varphi_V = \varphi_{U \cap V} \quad \text{and} \quad \varphi_U + \varphi_V = \varphi_{U \triangle V},$$

where $U \triangle V$ is the symmetric difference, $U \triangle V = (U \cup V) \setminus (U \cap V)$. Thus our
ring can be described as being made up of subsets $U \subset M$ with the operations of
symmetric difference and intersection as sum and product. This ring was introduced
by Boole as a formal notation for assertions in logic. Since $x^2 = x$ for
every element of $F_2$, this relation holds for any function with values in $F_2$, that
is, it holds in A. A ring for which every element x satisfies $x^2 = x$ is a Boolean ring.
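The dictionary between subsets and functions with values in the two-element field can be checked exhaustively on a small set. In this sketch the set M = {1, 2, 3} and all function names are hypothetical choices made for illustration; subsets are modelled by Python frozensets and functions by dictionaries of values modulo 2.

```python
from itertools import combinations

M = frozenset({1, 2, 3})

def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def add(u, v):
    """Sum in the ring of subsets: symmetric difference."""
    return u ^ v

def mul(u, v):
    """Product in the ring of subsets: intersection."""
    return u & v

def indicator(u):
    """The function on M that takes the value 1 on u and 0 elsewhere."""
    return {m: (1 if m in u else 0) for m in M}

def func_add(f, g):
    return {m: (f[m] + g[m]) % 2 for m in M}

def func_mul(f, g):
    return {m: (f[m] * g[m]) % 2 for m in M}

ring = subsets(M)
for u in ring:
    assert mul(u, u) == u            # the Boolean identity x^2 = x
    for v in ring:
        # The indicator map respects both operations.
        assert indicator(add(u, v)) == func_add(indicator(u), indicator(v))
        assert indicator(mul(u, v)) == func_mul(indicator(u), indicator(v))
print(len(ring), "subsets checked")
```

The empty set plays the role of 0 and M itself the role of 1 in this ring.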

More general examples of Boolean rings can be constructed quite similarly, 
by taking not all subsets of M, but only some system S of subsets containing 
together with U and V the subsets $U \cap V$ and $U \cup V$, and together with U its
complement. For example, we could consider a topological space having the 
property that every open set is also closed (such a space is called 0-dimensional), 
and let S be the set of open subsets of M. It can be proved that every Boolean 
ring can be obtained in this way. In the following section § 4 we will indicate the 
principle on which the proof of this is based. 

The qualitatively new phenomenon that occurs on passing from fields to 
arbitrary commutative rings is the appearance of a nontrivial theory of divisi- 
bility. An element a of a ring A is divisible by an element b if there exists c in A 
such that a = bc. A field is precisely a ring in which the divisibility theory is 
trivial: any element is divisible by any nonzero element, since $a = b(ab^{-1})$. The
classical example of divisibility theory is the theory of divisibility in the ring Z: 
this was constructed already in antiquity. The basic theorem of this theory is the 
fact that any integer can be uniquely expressed as a product of prime factors. 
The proof of this theorem, as is well known, is based on division with remainder 
(or the Euclidean algorithm). 

Let A be an arbitrary integral domain. We say that an element $a \in A$ is
invertible or is a unit of A if it has an inverse in A; in Z the units are ±1, in K[x]
the nonzero constants $c \in K$, and in K[[x]] the series $\sum_{i=0}^{\infty} a_i x^i$ with $a_0 \neq 0$. Any
element of A is divisible by a unit. An element a is said to be prime if its only
factorisations are of the form $a = c(c^{-1}a)$ where c is a unit. If an integral domain
A has the property that every nonzero element can be written as a product of
primes, and this factorisation is unique up to renumbering the prime factors and
multiplication by units, we say that A is a unique factorisation domain (UFD) or
a factorial ring. Thus Z is a UFD, and so is K[x] (the proof uses division with
remainder for polynomials). It can be proved that if A is a UFD then so is A[x];
hence $A[x_1, \dots, x_n]$ is also a UFD. The prime elements of a polynomial ring are
called irreducible polynomials. In C[x] only the linear polynomials are irreducible,
and in R[x] only linear polynomials and quadratic polynomials having no
real roots. In Q[x] there are irreducible polynomials of any degree, for example
the polynomial $x^n - p$ where p is any prime number.

Important examples of UFDs are the ring $\mathcal{O}_n$ of functions in n complex
variables holomorphic at the origin, and the formal power series ring
$K[[t_1, \dots, t_n]]$ (Example 7). The proof that these are UFDs is based on the
Weierstrass preparation theorem, which reduces the problem to functions (or
formal power series) which are polynomials in one of the variables. After this,
one applies the fact that A[t] is a UFD (provided A is) and an induction.


Example 10. The Gaussian Integers. It is easy to see that the complex numbers 
of the form m + ni, where m and n are integers, form a ring. This is a UFD, as 
can also be proved using division with remainder (but the quantity that decreases 
on taking the remainder is $m^2 + n^2$). Since in this ring


$$m^2 + n^2 = (m + ni)(m - ni),$$


divisibility in it can be used as the basis of the solution of the problem of 
representing integers as the sum of two squares. 
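Division with remainder for Gaussian integers can be sketched as follows. This is one possible way of carrying it out, not a procedure taken from the text: the quotient is obtained by rounding each coordinate of the exact complex quotient to a nearest integer, which forces the norm of the remainder to be at most half the norm of the divisor. A Gaussian integer m + ni is stored as the pair (m, n); all names are hypothetical.

```python
def norm(z):
    """The quantity m^2 + n^2 that decreases on taking the remainder."""
    m, n = z
    return m * m + n * n

def mul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def sub(z, w):
    return (z[0] - w[0], z[1] - w[1])

def nearest(x, d):
    """An integer within 1/2 of x/d for d > 0, using integer arithmetic only."""
    return (2 * x + d) // (2 * d)

def divmod_gauss(z, w):
    """Return (q, r) with z = q*w + r and norm(r) < norm(w), for w != 0."""
    d = norm(w)
    a, b = mul(z, (w[0], -w[1]))      # z times the conjugate of w; z/w = (a + bi)/d
    q = (nearest(a, d), nearest(b, d))
    r = sub(z, mul(q, w))
    return q, r

def gcd_gauss(z, w):
    """Euclidean algorithm; the result is determined only up to a unit."""
    while w != (0, 0):
        _, r = divmod_gauss(z, w)
        z, w = w, r
    return z

print(divmod_gauss((7, 3), (2, 1)))
print(mul((2, 1), (2, -1)))           # 5 = 2^2 + 1^2 splits as (2 + i)(2 - i)
print(gcd_gauss((5, 0), (2, 1)))
```

The last two lines illustrate the connection with sums of two squares: the integer 5 = 2² + 1² is divisible by 2 + i in this ring.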


Example 11. Let $\varepsilon$ be a (complex) root of the equation $\varepsilon^2 + \varepsilon + 1 = 0$. Complex
numbers of the form $m + n\varepsilon$, where m and n are integers, also form a ring, which
is also a UFD. In this ring the expression $m^3 + n^3$ factorises as a product:

$$m^3 + n^3 = (m + n)(m + n\varepsilon)(m + n\bar{\varepsilon}),$$

where $\bar{\varepsilon} = \varepsilon^2 = -(1 + \varepsilon)$ is the complex conjugate of $\varepsilon$. Because of this, divisibility
theory in this ring serves as the basis of the proof of Fermat’s Last Theorem
for cubes. The 18th century mathematicians Lagrange and Euler were
amazed to find that the proof of a theorem of number theory (the theory of
the ring Z) can be based on introducing other numbers (elements of other
rings).
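The factorisation of a sum of two cubes can be verified with exact integer arithmetic. In this sketch (names hypothetical) the number a + bε is stored as the pair (a, b), and products are reduced using the relation ε² = −1 − ε, which is just the defining equation rewritten.

```python
def mul(z, w):
    """(a + b e)(c + d e), reduced with e^2 = -1 - e."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c - b * d)

def factor_product(m, n):
    """The product (m + n)(m + n e)(m + n conj(e)), using conj(e) = -1 - e."""
    first = (m + n, 0)
    second = (m, n)
    third = (m - n, -n)
    return mul(mul(first, second), third)

print(mul((0, 1), (0, 1)))     # e squared, as a pair
print(factor_product(2, 1))    # compare with 2^3 + 1^3
for m in range(-5, 6):
    for n in range(-5, 6):
        assert factor_product(m, n) == (m ** 3 + n ** 3, 0)
```

The second coordinate of the product always vanishes, so the product is an ordinary integer, as the identity requires.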


Example 12. We give an example of an integral domain which is not a UFD;
this is the ring consisting of all complex numbers of the form $m + n\sqrt{-5}$ where
$m, n \in Z$. Here is an example of two different factorisations into irreducible
factors:

$$3^2 = (2 + \sqrt{-5})(2 - \sqrt{-5}).$$

We need only check that 3, $2 + \sqrt{-5}$ and $2 - \sqrt{-5}$ are irreducible elements.
For this, we write $N(\alpha)$ for the square of the absolute value of $\alpha$; if $\alpha = n + m\sqrt{-5}$
then $N(\alpha) = (n + m\sqrt{-5})(n - m\sqrt{-5}) = n^2 + 5m^2$, which is a positive integer for $\alpha \neq 0$.
Moreover, it follows from the properties of absolute value that $N(\alpha\beta) = N(\alpha)N(\beta)$.
If, say, $2 + \sqrt{-5}$ is reducible, for example $2 + \sqrt{-5} = \alpha\beta$, then
$N(2 + \sqrt{-5}) = N(\alpha)N(\beta)$. But $N(2 + \sqrt{-5}) = 9$, and hence there are only three
possibilities: $(N(\alpha), N(\beta)) = (3, 3)$ or (1, 9) or (9, 1). The first of these is impossible,
since 3 cannot be written in the form $n^2 + 5m^2$ with n, m integers. In the second
$\alpha = \pm 1$ and in the third $\beta = \pm 1$, so $\alpha$ or $\beta$ is a unit. This proves that $2 + \sqrt{-5}$
is irreducible.
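The two facts about norms used in this argument can be confirmed by a finite search. In the sketch below (names hypothetical) the number n + m√−5 is stored as the pair (n, m); since n² + 5m² = N forces |n| and |m| to be at most √N, the search is exhaustive.

```python
from math import isqrt

def norm(n, m):
    """N(n + m*sqrt(-5)) = n^2 + 5 m^2."""
    return n * n + 5 * m * m

def mul(z, w):
    """Product of n + m*sqrt(-5) and c + d*sqrt(-5), stored as pairs."""
    (a, b), (c, d) = z, w
    return (a * c - 5 * b * d, a * d + b * c)

def elements_of_norm(value):
    """All pairs (n, m) with n^2 + 5 m^2 equal to value."""
    bound = isqrt(value)
    return [(n, m)
            for n in range(-bound, bound + 1)
            for m in range(-bound, bound + 1)
            if norm(n, m) == value]

print(elements_of_norm(3))     # empty: no element has norm 3
print(elements_of_norm(1))     # the elements of norm 1
print(mul((2, 1), (2, -1)))    # (2 + sqrt(-5))(2 - sqrt(-5))
```

The empty first list rules out the case (3, 3), and the second list shows that an element of norm 1 is ±1, hence a unit.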

To say that a ring is not a UFD does not mean to say that it does not have 
an interesting theory of divisibility. On the contrary, in this case the theory of 
divisibility becomes especially interesting. We will discuss this in more detail in 
the following section § 4.


§4. Homomorphisms and Ideals


A further difference of principle between arbitrary commutative rings and
fields is the existence of nontrivial homomorphisms. A homomorphism of a ring
A to a ring B is a map $f\colon A \to B$ such that

$$f(a_1 + a_2) = f(a_1) + f(a_2), \quad f(a_1 a_2) = f(a_1) \cdot f(a_2) \quad \text{and} \quad f(1_A) = 1_B$$

(we write $1_A$ and $1_B$ for the identity elements of A and B). An isomorphism is a
homomorphism having an inverse.

If a ring has a topology, then usually only continuous homomorphisms are of 
interest. 

Typical examples of homomorphisms arise if the rings A and B are realised as
rings of functions on sets X and Y (for example, continuous, differentiable or
analytic functions, or polynomial functions on an algebraic curve C). A map
$\varphi\colon Y \to X$ transforms a function F on X into the function $\varphi^* F$ on Y defined by
the condition

$$(\varphi^* F)(y) = F(\varphi(y)).$$

If $\varphi$ satisfies the natural conditions for the theory under consideration (that is,
if $\varphi$ is a continuous, differentiable or analytic map, or is given by polynomial
expressions) then $\varphi^*$ defines a homomorphism of A to B. The simplest particular
case is when $\varphi$ is an embedding, that is Y is a subset of X. Then $\varphi^*$ is simply
the restriction to Y of functions defined on X.


Example 1. If C is a curve, defined by the equation F(x, y) = 0 where
$F \in K[x, y]$ is an irreducible polynomial, then restriction to C defines a
homomorphism $K[x, y] \to K[C]$.

The case which arises most often is when Y is one point of a set X, that is
$Y = \{x_0\}$ with $x_0 \in X$; then we are just evaluating a function, taking it into its
value at $x_0$.


Example 2. If $x_0 \in C$ then taking each function of K[C] into its value at $x_0$
defines a homomorphism $K[C] \to K$.


Example 3. If $\mathcal{C}$ is the ring of continuous functions on [0, 1] and $x_0 \in [0, 1]$
then taking a function $\varphi \in \mathcal{C}$ into its value $\varphi(x_0)$ is a homomorphism $\mathcal{C} \to R$. If
A is the ring of functions which are holomorphic in a neighbourhood of 0, then
taking $\varphi \in A$ into its value $\varphi(0)$ is a homomorphism $A \to C$.

Interpreting the evaluation of a function at a point as a homomorphism has 
led to a general point of view on the theory of rings, according to which a 
commutative ring can very often be interpreted as a ring of functions on a set, 
the ‘points’ of which correspond to homomorphisms of the original ring into 
fields. The original example is the ring K[C], where C is an algebraic variety,
and from it, the geometric intuition spreads out to more general rings. Thus
the concept that ‘every geometric object can be coordinatised by some ring of
functions on it’ is complemented by another, that ‘any ring coordinatises some
geometric object’.

We have already run into these two points of view, the algebraic and func- 
tional, in the definition of the polynomial ring in §3. The relation between the 
two will gradually become deeper and clearer in what follows. 


Example 4. Consider the ring A of functions which are holomorphic in the disc
|z| < 1 and continuous for $|z| \leq 1$. In the same way as before, any point $z_0$ with
$|z_0| \leq 1$ defines a homomorphism $A \to C$, taking a function $\varphi \in A$ into $\varphi(z_0)$. It
can be proved that all homomorphisms $A \to C$ over C are provided in this way.
Consider the boundary values of functions in A; these are continuous functions
on the circle |z| = 1, whose Fourier coefficients with negative index are all zero,
that is, with Fourier expansions of the form $\sum_{n \geq 0} c_n e^{2\pi i n \varphi}$. Since a function $f \in A$
is determined by its boundary values, A is isomorphic to the ring of continuous
functions on the circle with Fourier series of the indicated type. However, in this
interpretation, only the homomorphisms of A corresponding to points of the
boundary circle |z| = 1 are immediately visible. Thus considering the set of all
homomorphisms sometimes helps to reestablish the set on which the elements
of the ring should naturally be viewed as functions.

In the ring of functions which are holomorphic and bounded for |z| < 1, by
no means all homomorphisms are given in terms of points $z_0$ with $|z_0| < 1$. The
study of these is related to delicate questions of the theory of analytic functions.

For a Boolean ring (see §3, Example 9), it is easy to see that the image of a
homomorphism $\varphi\colon A \to F$ in a field F is a field with two elements. Hence,
conversely, any element $a \in A$ sends a homomorphism $\varphi$ to the element $\varphi(a) \in F_2$.
This is the idea of the proof of the main theorem on Boolean rings: for M one
takes the set of all homomorphisms $A \to F_2$, and A is interpreted as a ring of
functions on M with values in $F_2$.


Example 5. Let $\mathcal{K}$ be a compact subset of the space $C^n$ of n complex variables,
and A the ring of functions which are uniform limits of polynomials on $\mathcal{K}$.
The homomorphisms $A \to C$ over C are not exhausted by those corresponding
to points $z \in \mathcal{K}$; they are in 1-to-1 correspondence with points of the so-called
polynomial convex hull of $\mathcal{K}$, that is with the points $z \in C^n$ such that
$|f(z)| \leq \sup_{\mathcal{K}} |f|$ for every polynomial f.


Example 6. Suppose we assign to an integer the symbol 0 if it is even and 1
if it is odd. We get a homomorphism $Z \to F_2$ of the ring of integers to the field
with 2 elements $F_2$ (addition and multiplication tables of which were given in § 1,
Figures 3 and 4). Properly speaking, the operations on 0 and 1 were defined in
order that this map should be a homomorphism.
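The three conditions in the definition of a homomorphism can be tested directly for the parity map. The sketch below is only a finite spot check on a sample of integers, not a proof; the two-element field is modelled as {0, 1} with arithmetic modulo 2, and the function names are hypothetical.

```python
def parity(n):
    """0 for an even integer, 1 for an odd one."""
    return n % 2

def is_homomorphism(sample):
    """Check f(a + b) = f(a) + f(b), f(ab) = f(a) f(b) and f(1) = 1 on a finite sample."""
    sums_ok = all(parity(a + b) == (parity(a) + parity(b)) % 2
                  for a in sample for b in sample)
    products_ok = all(parity(a * b) == (parity(a) * parity(b)) % 2
                      for a in sample for b in sample)
    return sums_ok and products_ok and parity(1) == 1

print(is_homomorphism(range(-10, 11)))
```

Reading the check backwards explains the remark in the text: the rule 1 + 1 = 0 in the two-element field is forced by the fact that odd plus odd is even.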

Let $f\colon A \to B$ be a homomorphism of commutative rings. The set of elements
f(a) with $a \in A$ forms a subring of B, as one sees from the definition of homomorphism;
this is called the image of f, and is denoted by Im f or f(A). The set of
elements $a \in A$ for which f(a) = 0 is called the kernel of f, and denoted by Ker f.
If B = Im f then we say that B is a homomorphic image of A.

If Ker f = 0 then f is an isomorphism from A to the subring f(A) of B; for if
f(a) = f(b) then it follows from the definition of homomorphism that $f(a - b) = 0$,
that is, $a - b \in \operatorname{Ker} f = 0$ and so a = b. Thus f is a 1-to-1 correspondence from
A to f(A), and hence an isomorphism. This fact draws our attention to the
importance of the kernels of homomorphisms.

It follows at once from the definitions that if $a_1, a_2 \in \operatorname{Ker} f$ then $a_1 + a_2 \in \operatorname{Ker} f$,
and if $a \in \operatorname{Ker} f$ then $ax \in \operatorname{Ker} f$ for any $x \in A$. We say that a nonempty
subset I of a ring A is an ideal if it satisfies these two properties, that is,

$$a_1, a_2 \in I \Rightarrow a_1 + a_2 \in I, \quad \text{and} \quad a \in I \Rightarrow ax \in I \text{ for any } x \in A.$$

Thus the kernel of any homomorphism is an ideal. A universal method of
constructing ideals is as follows. For an arbitrary set $\{a_\lambda\}$ of elements of A,
consider the set I of elements which can be represented in the form $\sum x_\lambda a_\lambda$ for
some $x_\lambda \in A$ (we assume that only a finite number of nonzero terms appears in
each sum). Then I is an ideal; it is called the ideal generated by $\{a_\lambda\}$. Most
commonly the set $\{a_\lambda\}$ is finite. An ideal I = (a) generated by a single element is
called a principal ideal. If a divides b then $(b) \subset (a)$.

A field K has only two ideals, (0) and (1) = K. For if $I \subset K$ is an ideal of K
and $0 \neq a \in I$ then $I \ni aa^{-1}b = b$ for any $b \in K$, and hence I = K (this is another
way of saying that the theory of divisibility is trivial in a field). It follows from
this that any homomorphism $K \to B$ from a field is an isomorphism with some
subfield of B.

Conversely, if a commutative ring A does not have any ideals other than (0)
and (1), and $0 \neq 1$ then A is a field. Indeed, then for any element $a \neq 0$ we must
have (a) = A, and in particular $1 \in (a)$, so that 1 = ab for some $b \in A$, and a has
an inverse.

In the ring of integers Z, any ideal I is principal: it is easy to see that if $I \neq (0)$
then I = (n), where n is the smallest positive integer contained in I. The same is
true of the ring K[x]; here any ideal I is of the form I = (f(x)), where f(x) is a
polynomial of smallest degree contained in I. In the ring K[x, y], it is easy to see
that the ideal I of polynomials without constant term is not principal; it is of the
form (x, y). An integral domain in which every ideal is principal is called a
principal ideal domain (PID).
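For the ring of integers, the statement that an ideal is generated by its smallest positive element can be illustrated numerically. The sketch below uses the standard fact, not stated in the text, that the ideal generated by finitely many integers is generated by their greatest common divisor, and compares it with a brute-force search over combinations with small coefficients. The coefficient bound and the function names are hypothetical.

```python
from functools import reduce
from itertools import product
from math import gcd

def generator(gens):
    """Non-negative generator of the ideal of Z generated by gens (their g.c.d.)."""
    return reduce(gcd, gens, 0)

def smallest_positive_combination(gens, coeff_bound):
    """Smallest positive value of sum(x_i * a_i) over integers |x_i| <= coeff_bound."""
    coefficient_range = range(-coeff_bound, coeff_bound + 1)
    values = (sum(x * a for x, a in zip(xs, gens))
              for xs in product(coefficient_range, repeat=len(gens)))
    return min(v for v in values if v > 0)

print(generator([12, 18]), smallest_positive_combination([12, 18], 3))
print(generator([6, 10, 15]), smallest_positive_combination([6, 10, 15], 1))
```

The brute-force search only sees combinations with bounded coefficients, so agreement with the g.c.d. depends on the bound being large enough; for the two examples shown it is.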

It is not by chance that the rings Z and K[x] are unique factorisation domains:
one can prove that any PID is a UFD. But the example of K[x, y] shows that
there are more UFDs than PIDs. In exactly the same way, the ring $\mathcal{O}_n$ of functions
of n > 1 complex variables which are holomorphic at the origin (§ 3, Example 7)
is a UFD but not a PID. The study of ideals in this ring plays an important role
in the study of local analytic varieties, defined in a neighbourhood of the origin
by equations $f_1 = 0, \dots, f_m = 0$ (with $f_i \in \mathcal{O}_n$). The representation of such varieties
as a union of irreducibles, the notion of their dimension, and so on, are based on
properties of these ideals.


Example 7. In the ring $\mathcal{C}$ of continuous functions on the interval, taking a
function $\varphi$ to its value $\varphi(x_0)$ at $x_0$ is a homomorphism with kernel the ideal
$I_{x_0} = \{\varphi \in \mathcal{C} \mid \varphi(x_0) = 0\}$. It is easy to see that $I_{x_0}$ is not principal: any function
which tends to 0 substantially slower than a given function $\varphi(x)$ as $x \to x_0$ (for
example $\sqrt{|\varphi(x)|}$) is not contained in the ideal $(\varphi(x))$. One can prove in a similar
way that $I_{x_0}$ is not even generated by any finite number $\varphi_1, \dots, \varphi_m \in I_{x_0}$ of
functions in it.

Another example of a similar nature can be obtained in the ring $\mathcal{E}$ of germs of
$C^\infty$ functions at 0 on the line (by definition two functions define the same germ
at 0 if they are equal in some neighbourhood of 0). The ideal $M_n$ of germs of
functions which vanish at 0 together with all of their derivatives of order $\leq n$ is
principal, equal to $(x^{n+1})$, but the ideal $M_\infty$ of germs of functions all of whose
derivatives vanish at 0 (such as $e^{-1/x^2}$) is not generated by any finite system of
functions, as can be proved. In any case, the extent to which these examples carry
conviction should not be exaggerated: it is more natural to use the topology of
the ring $\mathcal{C}$ of continuous functions, and consider ideals topologically generated
by functions $\varphi_1, \dots, \varphi_m$, that is, the closure of the ideal $(\varphi_1, \dots, \varphi_m)$. In this
topological sense, any ideal of $\mathcal{C}$ is generated by one function. The same considerations
apply to the ring $\mathcal{E}$, but its topology is defined in a more complicated
way, and, for example, the fact that the ideal $M_\infty$ is not generated by any finite
system of functions then contains more genuine information.

Let I and J be two ideals of a ring A. The ideal generated by the set of all
products ij with $i \in I$ and $j \in J$ is called the product of I and J and denoted by
IJ. Multiplication of principal ideals agrees with that of elements: if I = (a) and
J = (b) then IJ = (ab). By analogy with the question of the unique factorisation
of elements into prime factors, we can pose the question of factorising ideals of 
a ring as a product of ideals which cannot be factorised any further. Of course, 
both of these properties hold in a principal ideal domain. But there exist impor- 
tant types of ring which are not factorial, but in which the ideals have unique 
factorisation into products of irreducible factors. 


Example 8. Consider the ring of numbers of the form $m + n\sqrt{-5}$ with $m, n \in Z$,
given in §3, Example 12 as an example of a nonfactorial ring. The factorisation

$$3^2 = (2 + \sqrt{-5})(2 - \sqrt{-5}) \tag{1}$$

which we gave in §3 is not a factorisation into prime factors if we replace the
numbers by the corresponding principal ideals. It is not hard to see that

$$(2 + \sqrt{-5}) = (2 + \sqrt{-5}, 3)^2, \quad (2 - \sqrt{-5}) = (2 - \sqrt{-5}, 3)^2$$

and

$$(3) = (2 + \sqrt{-5}, 3)(2 - \sqrt{-5}, 3),$$

so that (1) is just the product $(2 + \sqrt{-5}, 3)^2 (2 - \sqrt{-5}, 3)^2$ in which the factors
are grouped in different ways. The possibility of an analogous factorisation is the
basis of the arithmetic of algebraic numbers. This is the historical explanation of
the term ‘ideal’: the prime ideals into which an irreducible number factorises (for
example 3 or $2 + \sqrt{-5}$) were first considered as ‘ideal prime factors’.

The numbers 3 and $2 + \sqrt{-5}$ do not have common factors other than ±1,
since they are irreducible. But the ideal $(3, 2 + \sqrt{-5})$ is their greatest common
divisor (more precisely, it is the g.c.d. of the ideals (3), $(2 + \sqrt{-5})$). Similarly to
the fact that the greatest common divisor of integers a and b can be expressed
as au + bv, the ideal $(3, 2 + \sqrt{-5})$ consists of all numbers of the form
$3\alpha + (2 + \sqrt{-5})\beta$.
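The ideal identities of Example 8 are described as not hard to see; one way of checking two of them by hand is sketched here. The abbreviations $P = (2 + \sqrt{-5}, 3)$ and $P' = (2 - \sqrt{-5}, 3)$ are introduced only for this computation and do not appear in the text. A product of ideals is generated by the pairwise products of the generators.

```latex
\begin{align*}
PP' &= \bigl((2+\sqrt{-5})(2-\sqrt{-5}),\; 3(2+\sqrt{-5}),\; 3(2-\sqrt{-5}),\; 9\bigr)
     = \bigl(9,\; 3(2+\sqrt{-5}),\; 3(2-\sqrt{-5})\bigr) \subset (3), \\
3 &= 3(2+\sqrt{-5}) + 3(2-\sqrt{-5}) - 9 \in PP',
     \quad \text{so } PP' = (3); \\
P^2 &= \bigl((2+\sqrt{-5})^2,\; 3(2+\sqrt{-5}),\; 9\bigr)
     = (2+\sqrt{-5}) \cdot \bigl(2+\sqrt{-5},\; 3,\; 2-\sqrt{-5}\bigr), \\
1 &= (2+\sqrt{-5}) + (2-\sqrt{-5}) - 3,
     \quad \text{so } P^2 = (2+\sqrt{-5}).
\end{align*}
```

The third line uses $9 = (2+\sqrt{-5})(2-\sqrt{-5})$ to pull out the common factor; the identity for $(2 - \sqrt{-5})$ follows in the same way with the signs of $\sqrt{-5}$ reversed.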

The notion of ideal is especially important because of the fact that the relation 
between homomorphisms and ideals is reversible: every ideal is the kernel of 
some homomorphism. In order to construct from an ideal I of a ring A the ring
B which A will map to under this homomorphism, we introduce the following 
definitions. 

Elements $a_1$ and $a_2$ of a ring A are said to be congruent modulo an ideal I of
A (or congruent mod I) if $a_1 - a_2 \in I$. This is written as follows:

$$a_1 \equiv a_2 \bmod I.$$

If A = Z and I = (n) then we obtain the classical notion of congruence in number
theory: $a_1$ and $a_2$ are congruent mod n if they have the same remainder on
dividing by n.

Congruence modulo an ideal is an equivalence relation, and it decomposes A 
as a union of disjoint classes of elements congruent to one another mod I. These 
classes are also called residue classes modulo I. 

Let $\Gamma_1$ and $\Gamma_2$ be two residue classes mod $I$. It is easy to see that however we
choose elements $a_1 \in \Gamma_1$ and $a_2 \in \Gamma_2$, the sum $a_1 + a_2$ will belong to the same
residue class $\Gamma$. This class is called the sum of $\Gamma_1$ and $\Gamma_2$. In a similar way we
define the product of residue classes. It is not hard to see that the set of all residue
classes modulo an ideal $I$ with the above definition of addition and multiplication
forms a commutative ring; this is called the residue class ring or the quotient ring
of $A$ modulo $I$, and denoted by $A/I$.

For example if $A = \mathbb{Z}$ and $I = (2)$ then $I$ has 2 residue classes, the even and
odd numbers; and the ring $\mathbb{Z}/(2)$ coincides with the field $\mathbb{F}_2$.
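
The claim that the sum and product of residue classes do not depend on the chosen representatives can be checked mechanically in the case of the integers modulo $(n)$. The following is only a minimal sketch: the names `residue` and `well_defined` are illustrative and do not come from the text, and the check samples just a few representatives of each class rather than proving anything.

```python
from itertools import product


def residue(a: int, n: int) -> int:
    """Canonical representative (0..n-1) of the residue class of a modulo (n)."""
    return a % n


def well_defined(n: int, span: int = 2) -> bool:
    """Check, on sampled representatives, that the class of a1 + a2 and of a1 * a2
    depends only on the classes of a1 and a2."""
    shifts = [k * n for k in range(-span, span + 1)]  # sampled elements of the ideal (n)
    for r1, r2 in product(range(n), repeat=2):
        sums = {residue((r1 + s1) + (r2 + s2), n) for s1 in shifts for s2 in shifts}
        prods = {residue((r1 + s1) * (r2 + s2), n) for s1 in shifts for s2 in shifts}
        if len(sums) != 1 or len(prods) != 1:
            return False
    return True


# Addition and multiplication tables of Z/(2): the classes of even (0) and odd (1) numbers.
add_table = [[residue(a + b, 2) for b in range(2)] for a in range(2)]
mul_table = [[residue(a * b, 2) for b in range(2)] for a in range(2)]
```

The two tables are exactly the operations of the field with two elements: even plus odd is odd, odd times odd is odd, and so on.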

It is easy to see that taking an element $a \in A$ into its residue class mod $I$ is a
homomorphism $f\colon A \to A/I$, with kernel $I$. This is called the canonical
homomorphism of a quotient ring.

Canonical homomorphisms of rings to their quotient rings give a more explicit 
description of arbitrary homomorphisms. Namely, the following assertion is easy 
to verify: 


Theorem I. For any ring homomorphism $\varphi\colon A \to B$, the image ring $\operatorname{Im}\varphi$ is
isomorphic to the quotient ring $A/\operatorname{Ker}\varphi$, and the isomorphism $\sigma$ between them
can be chosen so as to take the canonical homomorphism $\psi\colon A \to A/\operatorname{Ker}\varphi$ into
$\varphi\colon A \to \operatorname{Im}\varphi$.

More precisely, for any $a \in A$, the isomorphism $\sigma$ takes $\psi(a)$ into $\varphi(a)$ (recall
that $\sigma(\psi(a)) \in \operatorname{Im}\varphi \subset B$, so that $\sigma(\psi(a))$ and $\varphi(a)$ are both elements of $B$).

This result is most often applied in the case $\operatorname{Im}\varphi = B$. In this case the assertion
is the following.


II. Homomorphisms Theorem. A homomorphic image is isomorphic to the
quotient ring modulo the kernel of the homomorphism.
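
As a small illustration of the theorem, one can take a finite example and build the isomorphism $\sigma$ explicitly. The sketch below assumes the particular homomorphism from the integers mod 12 to the integers mod 4 given by reduction mod 4; this example, and all names in the code, are chosen here for illustration and are not taken from the text.

```python
N, M = 12, 4  # M divides N, so reduction mod M is well defined on Z/(N)
A = range(N)


def phi(a: int) -> int:
    """Ring homomorphism Z/(12) -> Z/(4): reduce a representative mod 4."""
    return a % M


kernel = frozenset(a for a in A if phi(a) == 0)


def coset(a: int) -> frozenset:
    """The residue class a + Ker(phi), an element of the quotient ring A/Ker(phi)."""
    return frozenset((a + k) % N for k in kernel)


# Collect the values of phi on each coset; sigma is well defined when phi is
# constant on every coset.
values_on_coset = {}
for a in A:
    values_on_coset.setdefault(coset(a), set()).add(phi(a))
well_defined = all(len(values) == 1 for values in values_on_coset.values())

# sigma takes the class psi(a) = a + Ker(phi) to phi(a).
sigma = {c: next(iter(values)) for c, values in values_on_coset.items()}

image = {phi(a) for a in A}
is_bijection = sorted(sigma.values()) == sorted(image)
respects_operations = all(
    sigma[coset((a + b) % N)] == (sigma[coset(a)] + sigma[coset(b)]) % M
    and sigma[coset((a * b) % N)] == (sigma[coset(a)] * sigma[coset(b)]) % M
    for a in A
    for b in A
)
```

Here the kernel consists of the classes of 0, 4 and 8, the quotient has four cosets, and $\sigma$ matches them bijectively with the four elements of the image.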


Under the canonical homomorphism $f$, the inverse image $f^{-1}(J)$ of any ideal
$J \subset A/I$ is an ideal of $A$ containing $I$, and the image $f(J')$ of any ideal $J'$ containing
$I$ is an ideal of $A/I$. This establishes a 1-to-1 correspondence between ideals of
the quotient ring $A/I$ and ideals of $A$ containing $I$.

In particular, as we know, $A/I$ is a field if and only if it has exactly two ideals,
$(0)$ and $(1)$, and this means that $I$ is not contained in any bigger ideal other than
$A$ itself. Such an $I$ is called a maximal ideal. It can be proved (using Zorn’s lemma
from set theory) that any ideal $I \ne A$ is contained in at least one maximal ideal.

Together with the construction of fields of fractions, considering quotient rings 
modulo maximal ideals is the most important method of constructing fields. We 
now show how to use this to obtain a series of new examples of fields. 


Example 9. In $\mathbb{Z}$, maximal ideals are obviously of the form $(p)$, where $p$ is a
prime number. Thus $\mathbb{Z}/(p)$ is a field; it has $p$ elements, and is denoted by $\mathbb{F}_p$. Up
to now we have only constructed fields $\mathbb{F}_2$ and $\mathbb{F}_3$ with 2 or 3 elements. If $n$ is not
prime, then the ring $\mathbb{Z}/(n)$ is not a field, and as one sees easily, is not even an
integral domain.
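
For small $n$ both assertions of Example 9 can be verified by brute force. The following sketch uses hypothetical helper names (`zero_divisors`, `inverses`, `is_field`) and simply searches the multiplication table of the integers mod $n$; it is meant as an illustration, not as an efficient test.

```python
def zero_divisors(n: int) -> list[int]:
    """Nonzero classes a in Z/(n) with a * b = 0 for some nonzero class b."""
    return [a for a in range(1, n) if any(a * b % n == 0 for b in range(1, n))]


def inverses(n: int) -> dict[int, int]:
    """Map each invertible class a in Z/(n) to its inverse."""
    return {a: b for a in range(1, n) for b in range(1, n) if a * b % n == 1}


def is_field(n: int) -> bool:
    """Z/(n) is a field when every nonzero class is invertible (n >= 2)."""
    return n >= 2 and len(inverses(n)) == n - 1


fields_below_20 = [n for n in range(2, 20) if is_field(n)]
```

For a composite modulus such as 6 the list of zero divisors is nonempty (for instance the classes of 2 and 3 multiply to 0), which is why the quotient is not an integral domain.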


Example 10. Consider now the polynomial ring $K[x]$; its maximal ideals are
of the form $(\varphi(x))$ with $\varphi(x)$ an irreducible polynomial. In this case, the quotient
ring $L = K[x]/(\varphi(x))$ is a field. Write $\alpha$ for the image of $x$ under the homomorphism
$K[x] \to L = K[x]/(\varphi(x))$. Then for tautological reasons, $\varphi(\alpha) = 0$, so that the
polynomial $\varphi$ has a root in $L$. Write $n$ for the degree of $\varphi$. Using division with
remainder, we can represent any polynomial $u(x) \in K[x]$ in a unique way in the
form $u(x) = \varphi(x)\psi(x) + v(x)$, where $v$ is a polynomial of degree less than $n$. It
follows from this that any element of $L$ can be uniquely expressed in the form

\[ a_0 + a_1\alpha + a_2\alpha^2 + \dots + a_{n-1}\alpha^{n-1}, \tag{2} \]

where $a_0, \dots, a_{n-1}$ are arbitrary elements of $K$.

If $K = \mathbb{R}$ and $\varphi(x) = x^2 + 1$ then we construct in this way the field $\mathbb{C}$ of
complex numbers; here $i$ is the image of $x$ in $\mathbb{R}[x]/(x^2 + 1)$, and $a + bi$ is the
image of $a + bx$.
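
The passage from a polynomial to its normal form (2) is just division with remainder, and it can be sketched in a few lines. The code below assumes rational coefficients (via `Fraction`) and a monic modulus, and stores polynomials as coefficient lists from the constant term upwards; these conventions and the function names are choices made here for illustration.

```python
from fractions import Fraction


def poly_mul(u, v):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


def reduce_mod(u, phi):
    """Remainder of u on division by the monic polynomial phi, padded to length deg(phi)."""
    u = [Fraction(c) for c in u]
    n = len(phi) - 1  # degree of phi
    while len(u) > n:
        lead = u.pop()      # leading coefficient of the current polynomial
        shift = len(u) - n  # subtract lead * x^shift * phi to cancel the leading term
        for k in range(n):
            u[shift + k] -= lead * phi[k]
    return u + [Fraction(0)] * (n - len(u))


def mul_in_L(u, v, phi):
    """Multiplication in L = K[x]/(phi), on normal forms."""
    return reduce_mod(poly_mul(u, v), phi)


PHI = [1, 0, 1]  # x^2 + 1
i = [0, 1]       # the image of x
i_squared = mul_in_L(i, i, PHI)
sample = mul_in_L([1, 2], [3, 4], PHI)  # (1 + 2x)(3 + 4x) mod x^2 + 1
```

With the modulus $x^2 + 1$ the result agrees with ordinary complex multiplication: the image of $x$ squares to $-1$, and $(1 + 2i)(3 + 4i) = -5 + 10i$.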

The above construction gives an extension field $L/K$ in which a given
polynomial $\varphi(t)$ has a root. Iterating this process, it can be proved that for any field
$K$, there exists an extension $\Sigma/K$ such that any polynomial $\varphi \in \Sigma[t]$ has a root
in $\Sigma$. A field having this property is said to be algebraically closed. For example,
$\mathbb{C}$ is algebraically closed.

Let $K$ be a field with $p$ elements. If $\varphi$ is an irreducible polynomial of degree $n$
over $K$ then the expression (2) shows that $L$ has $p^n$ elements. Based on these ideas,
one can prove the following results, which together describe all finite fields.


III. Theorem on Finite Fields.
(i) The number of elements of a finite field is of the form $p^n$, where $p$ is the
characteristic.
(ii) For each $p$ and $n$ there exists a field $\mathbb{F}_q$ with $q = p^n$ elements.
(iii) Two finite fields with the same number of elements are isomorphic.
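
To make the count $p^n$ concrete, here is a sketch of the field with 9 elements, built as the quotient of the polynomial ring over the field with 3 elements by the polynomial $x^2 + 1$. The choice of this polynomial is an assumption of the example (the code checks that it has no root mod 3, hence is irreducible of degree 2); the names are illustrative.

```python
from itertools import product

P = 3  # the prime field F_3 = Z/(3)

# x^2 + 1 has no root in F_3, so this quadratic is irreducible over F_3.
has_no_root = all((t * t + 1) % P != 0 for t in range(P))

# The pair (a, b) stands for the normal form a + b*alpha, with alpha the image of x.
elements = list(product(range(P), repeat=2))


def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)


def mul(u, v):
    a, b = u
    c, d = v
    # alpha^2 = -1 in F_3[x]/(x^2 + 1)
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def inverse(u):
    """Inverse of a nonzero element, found by exhaustive search."""
    return next(v for v in elements if mul(u, v) == (1, 0))


all_invertible = all(mul(u, inverse(u)) == (1, 0) for u in elements if u != (0, 0))
```

There are $3^2 = 9$ normal forms, and every nonzero one has an inverse, as part (ii) of the theorem predicts for $p = 3$, $n = 2$.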


Finite fields have very many applications. One of them, which specifically uses
the fact that they are finite, relates to the theory of error-correcting codes. By
definition, a code consists of a finite set $E$ (an ‘alphabet’) and a subset $U$ of the
set $E^n$ of all possible sequences $(a_1, \dots, a_n)$ with $a_i \in E$. This subset is to be chosen
in such a way that any two sequences in $U$ should differ at a sufficiently large
number of places. Then when we transmit a ‘message’ $(u_1, \dots, u_n) \in U$, we can still
reconstruct the original message even if a small number of the $u_i$ are corrupted.
A wealth of material for making such choices is provided by taking $E$ to be some
finite field $\mathbb{F}_q$, and $U$ to be a subspace of the vector space $\mathbb{F}_q^n$. Furthermore, the
greatest success has been achieved by taking $\mathbb{F}_q^n$ and $U$ to be finite-dimensional
subspaces of the field $\mathbb{F}_q(t)$ or even of $\mathbb{F}_q(C)$, where $C$ is an algebraic curve, and
determining the choice of these subspaces by means of certain geometric
conditions (such as considering functions with specified zeros and poles). Thus coding
theory has turned out to be related to very delicate questions of algebraic
geometry over finite fields.
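
The simplest instance of this idea can be sketched directly. The example below is not from the text: it assumes the classical binary Hamming code of length 7, taken as the subspace of sequences of length 7 over the field with 2 elements spanned by the rows of a generator matrix `G`, and decodes by picking the nearest codeword.

```python
from itertools import product

# Rows of a generator matrix of the [7, 4] Hamming code over F_2 (an assumed example).
G = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]


def encode(message):
    """Linear combination of the rows of G with coefficients from the message, mod 2."""
    return tuple(sum(m * g for m, g in zip(message, column)) % 2 for column in zip(*G))


# The code U: a 4-dimensional subspace of F_2^7, hence 16 codewords.
U = [encode(m) for m in product((0, 1), repeat=4)]


def distance(u, v):
    """Number of places at which two sequences differ."""
    return sum(a != b for a, b in zip(u, v))


min_distance = min(distance(u, v) for u in U for v in U if u != v)


def decode(received):
    """Nearest codeword; unique when at most one symbol was corrupted."""
    return min(U, key=lambda u: distance(u, received))
```

Any two codewords differ in at least 3 places, so a message with a single corrupted symbol is still closer to the original codeword than to any other.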

Considering already the simplest ring $\mathbb{Z}/(n)$ leads to interesting conclusions.
Let $K$ be an arbitrary field, with identity element $1$. Consider the map $f$ from $\mathbb{Z}$
to $K$ defined by

\[ f(n) = n \cdot 1, \quad\text{that is}\quad f(n) =
\begin{cases}
1 + \dots + 1 \ (n \text{ times}) & \text{if } n > 0, \\
0 & \text{if } n = 0, \\
-(1 + \dots + 1) \ (-n \text{ times}) & \text{if } n < 0.
\end{cases} \]

It is easy to see that $f$ is a homomorphism. Two cases are possible, either
$\operatorname{Ker} f = 0$ or $\operatorname{Ker} f \ne 0$.

In the first case $f(\mathbb{Z})$ is a subring of $K$ isomorphic to $\mathbb{Z}$. Since $K$ is a field, it
must also contain the ratios of elements of this ring, which one easily checks form
a subfield $K_0 \subset K$. It follows from the uniqueness of fields of fractions that $K_0$
is isomorphic to $\mathbb{Q}$, that is, $K$ contains a subfield isomorphic to $\mathbb{Q}$.

In the second case, suppose that $\operatorname{Ker} f = (n)$. Obviously, $n$ must be a prime
number, say $n = p$, since otherwise $f(\mathbb{Z}) \cong \mathbb{Z}/(n)$ would not be an integral domain.
But then $f(\mathbb{Z}) \cong \mathbb{Z}/(p) = \mathbb{F}_p$ is a field with $p$ elements.

Thus we have seen that any field $K$ contains either the field $\mathbb{Q}$ of rational
numbers, or a field $\mathbb{F}_p$ with some prime number of elements. These fields are called
the prime fields; any field is an extension of one of these. If $K$ contains a field
with $p$ elements, then $px = 0$ for every $x \in K$. In this case, $p$ is called the
characteristic of $K$, and we say that $K$ is a field of finite characteristic, and write
$\operatorname{char} K = p$. If $K$ contains $\mathbb{Q}$ then $nx = 0$ only if $n = 0$ or $x = 0$; in this case, we
say that $K$ has characteristic 0, and write $\operatorname{char} K = 0$ (or sometimes $\operatorname{char} K = \infty$).

The fields $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, $\mathbb{Q}(x)$, $\mathbb{R}(x)$, $\mathbb{C}(x)$ are of characteristic 0. The field $\mathbb{F}_p$ with $p$
elements has characteristic $p$, as have $\mathbb{F}_p(x)$, $\mathbb{F}_p(x, y)$ and so on.
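
The map $n \mapsto n \cdot 1$ suggests a direct, if naive, way to find the characteristic: keep adding the identity to itself until the sum vanishes. The sketch below does exactly this; the function name and the `limit` parameter are assumptions of the example, and returning 0 only means that no vanishing multiple was found below the limit, which is of course not a proof of characteristic 0.

```python
from fractions import Fraction


def characteristic(one, zero, add, limit=10_000):
    """Smallest n > 0 with 1 + ... + 1 (n times) equal to zero,
    or 0 if no such n is found up to `limit`."""
    total = one  # invariant: total is the n-fold sum of `one` at step n
    for n in range(1, limit + 1):
        if total == zero:
            return n
        total = add(total, one)
    return 0


char_f7 = characteristic(1, 0, lambda a, b: (a + b) % 7)
char_q = characteristic(Fraction(1), Fraction(0), lambda a, b: a + b, limit=1000)
# Pairs (a, b) mod 3, added componentwise: the additive structure of a field with 9 elements.
char_f9 = characteristic(
    (1, 0), (0, 0), lambda u, v: ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3)
)
```

The field with 9 elements has characteristic 3, not 9: the characteristic is always the number of elements of the prime field contained in $K$.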

A ring $A/I$ can be embedded in a field if and only if it is an integral domain.
This means that $I \ne A$ and if $a, b \in A$ and $ab \in I$ then either $a \in I$ or $b \in I$. We
say that an ideal is prime if it satisfies this condition. For example, the principal
ideal $I = (F(x, y)) \subset K[x, y]$ is prime if $F$ is an irreducible polynomial: the ring
$K[x, y]/I = K[C]$ (where $C$ is the algebraic curve with equation $F(x, y) = 0$) can
be embedded in the field $K(C)$. We can say that a prime ideal is the kernel of a
homomorphism $\varphi\colon A \to K$, where $K$ is a field (but possibly $\varphi(A) \ne K$).

It can be shown that the ideals of Example 8 which are irreducible (in the sense 
that they do not decompose as a product of factors) are exactly the prime ideals 
in the sense of the above definition. 

At the beginning of this section we discussed the point of view that any ring
can be thought of as a ring of functions on some space $X$. The ‘points’ of the
space correspond to homomorphisms of the ring into fields. Hence we can
interpret them as maximal ideals (or in another version, prime ideals) of the ring.
If $M$ is an ideal ‘specifying a point $x \in X$’ and $a \in A$, then the ‘value’ $a(x)$ of $a$ at
$x$ is the residue class $a + M$ in $A/M$. The resulting geometric intuition might at
first seem to be rather fanciful. For example, in $\mathbb{Z}$, maximal ideals correspond to
prime numbers, and the value at each ‘point’ $(p)$ is an element of the field $\mathbb{F}_p$
corresponding to $p$ (thus we should think of $1984 = 2^6 \cdot 31$ as a function on the
set of primes¹, which vanishes at $(2)$ and $(31)$; we can even say that it has a zero
of multiplicity 6 at $(2)$ and of multiplicity 1 at $(31)$). However, this is nothing more
than a logical extension of the analogy between the ring of integers $\mathbb{Z}$ and the
polynomial ring $K[t]$, under which prime numbers $p \in \mathbb{Z}$ correspond to
irreducible polynomials $P(t) \in K[t]$. Continuing the analogy, the equation
$a_0(t) + a_1(t)x + \dots + a_n(t)x^n = 0$ with $a_i(t) \in K[t]$ defining an algebraic function $x(t)$
should be considered as analogous to the defining equation
$a_0 + a_1 x + \dots + a_n x^n = 0$ with $a_i \in \mathbb{Z}$ of an algebraic number. In fact, in the study of algebraic
numbers, it has turned out to be possible to apply the intuition of the theory of
algebraic functions, and even of the Riemann surfaces associated with them.
Several of the most beautiful achievements of number theory can be attributed
to the systematic development of this point of view.
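
The ‘function’ 1984 can be tabulated quite literally. In this sketch the value at the point $(p)$ is the residue mod $p$ and the multiplicity of a zero is the exponent of $p$ in the integer; the helper names are illustrative.

```python
def value_at(a: int, p: int) -> int:
    """Value of the integer a at the 'point' (p): its residue class in F_p."""
    return a % p


def multiplicity(a: int, p: int) -> int:
    """Order of vanishing of a at (p): the largest m with p**m dividing a."""
    if a == 0:
        raise ValueError("0 vanishes to infinite order at every point")
    m = 0
    while a % p == 0:
        a //= p
        m += 1
    return m


points = [2, 3, 5, 7, 31]
values = {p: value_at(1984, p) for p in points}
orders = {p: multiplicity(1984, p) for p in points}
```

The table shows zeros only at the points $(2)$ and $(31)$, of orders 6 and 1 respectively, in agreement with the factorisation of 1984.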

¹ This example was chosen because of the year the book was written, and has nothing to do with the
fiction of George Orwell (translator’s footnote).

Another version of the same ideas plays an important role in considering maps
$\varphi\colon Y \to X$ (for example, analytic maps between complex analytic manifolds). If
$A$ is the ring of analytic functions on $X$ and $B$ that on $Y$, then as we said at the
beginning of this section, a map $\varphi$ determines a homomorphism $\varphi^*\colon A \to B$. Let
$Z \subset X$ be a submanifold and $I \subset A$ the ideal of functions vanishing on $Z$. If
$I = (f_1, \dots, f_r)$, this means that $Z$ is defined by the equations $f_1 = 0, \dots, f_r = 0$.
The inverse image $\varphi^{-1}(Z)$ of $Z$ in $Y$ is defined by the equations $\varphi^* f_1 = 0, \dots,
\varphi^* f_r = 0$, and it is natural to associate with it the ring $B/(\varphi^* f_1, \dots, \varphi^* f_r) =
B/(\varphi^* I)B$. Suppose for example that $\varphi$ is the map of a line $Y$ to a line $X$ given by
$x = y^2$. If $Z$ is the point $x = a \ne 0$ then $\varphi^{-1}(Z)$ consists of two points $y = \pm\sqrt{a}$,
and

\[ B/(\varphi^* I)B = \mathbb{C}[y]/(y^2 - a) \cong \mathbb{C}[y]/(y - \sqrt{a}) \oplus \mathbb{C}[y]/(y + \sqrt{a}) \cong \mathbb{C} \oplus \mathbb{C}; \]

that is, it is in fact the ring of functions on a pair of points. But if $Z$ is the point
$x = 0$ then $\varphi^{-1}(Z)$ is the single point $y = 0$, and $B/(\varphi^* I)B = \mathbb{C}[y]/(y^2)$. This ring
consists of elements of the form $\alpha + \beta\varepsilon$, with $\alpha, \beta \in \mathbb{C}$, and $\varepsilon$ the image of $y$, with
$\varepsilon^2 = 0$; it can be interpreted as the ‘ring of functions on a double point’, and it
gives much more precise information on the behaviour of the map $x = y^2$ in a
neighbourhood of $x = 0$ than just the set-theoretic inverse image of this point.
In the same way, the study of singularities of analytic maps leads to considering
much more complicated commutative rings as invariants of these singularities.
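
Both fibre rings of the map $x = y^2$ can be modelled on pairs of coefficients. In this sketch a pair `(u0, u1)` stands for the class of $u_0 + u_1 y$; the function names, the integer sample values and the choice of the point $x = 4$ are assumptions made for the illustration.

```python
def mul_mod(u, v, a):
    """Product of u0 + u1*y and v0 + v1*y in C[y]/(y^2 - a): y^2 is replaced by a."""
    return (u[0] * v[0] + a * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def values_at_roots(u, root):
    """Values of u0 + u1*y at the two points y = +root and y = -root."""
    return (u[0] + u[1] * root, u[0] - u[1] * root)


# Over the point x = 4 the fibre is the pair of points y = 2 and y = -2, and
# multiplication in the quotient ring is multiplication of values at these points.
u, v = (1, 3), (2, -1)
lhs = values_at_roots(mul_mod(u, v, 4), 2)
rhs = tuple(s * t for s, t in zip(values_at_roots(u, 2), values_at_roots(v, 2)))

# Over the point x = 0 we get the 'double point': epsilon is the image of y.
epsilon = (0, 1)
epsilon_squared = mul_mod(epsilon, epsilon, 0)
```

Over $x = 4$ an element is determined by its two values, so no nonzero element can square to zero; over $x = 0$ the element $\varepsilon$ is nonzero yet has square zero, which is the extra information carried by the double point.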


Example 11. Let $K_1, K_2, \dots, K_n, \dots$ be an infinite sequence of fields. Consider
all possible infinite sequences $(a_1, a_2, \dots, a_n, \dots)$ with $a_i \in K_i$, and define
operations on them by

\[ (a_1, a_2, \dots, a_n, \dots) + (b_1, b_2, \dots, b_n, \dots) = (a_1 + b_1, a_2 + b_2, \dots, a_n + b_n, \dots) \]

and

\[ (a_1, a_2, \dots, a_n, \dots)(b_1, b_2, \dots, b_n, \dots) = (a_1 b_1, a_2 b_2, \dots, a_n b_n, \dots). \]

We thus obtain a commutative ring called the product of the fields $K_i$, and
denoted $\prod K_i$.

Certain homomorphisms of the ring $\prod K_i$ into fields (and hence, its maximal
ideals) are immediately visible: we take the sequence $(a_1, a_2, \dots, a_n, \dots)$ into its
$n$th component $a_n$ (for fixed $n$). But there are also less trivial homomorphisms.
In fact, consider all the sequences with only finitely many nonzero components
$a_i$; these form an ideal $I$. Every ideal is contained in a maximal ideal, so let $\mathfrak{M}$
be some maximal ideal of $\prod K_i$ containing $I$. This is distinct from the kernels of
the above trivial homomorphisms, since these do not contain $I$. The quotient
ring $\prod K_i/\mathfrak{M}$ is a field, and is called an ultraproduct of the fields $K_i$. We obtain
an interesting ‘mixture’ of the fields $K_i$; for example, if all the $K_i$ have different
finite characteristics, then their ultraproduct is of characteristic 0. This is one
method of passing from fields of finite characteristic to fields of characteristic 0,
and using it allows us to prove certain hard theorems of number theory.

If all the fields $K_i$ coincide with the field $\mathbb{R}$ of real numbers, then their
ultraproduct has applications in analysis. It lies at the basis of so-called
nonstandard analysis, which allows us, for example, to avoid hard estimates and
verifications of convergence in certain questions of the theory of differential
equations.

From the point of view of mathematical logic, ultraproducts are interesting in
that any ‘elementary’ statement which is true in all the fields $K_i$ remains true in
their ultraproduct.


Example 12. Consider differential operators of the form

\[ \mathcal{D} = \sum_{i=0}^{n} f_i(z)\,\frac{d^i}{dz^i}, \]

where the $f_i(z)$ are Laurent series (either convergent or formal). Multiplication
of such operators is not necessarily commutative; but for certain pairs of
operators $\mathcal{D}$ and $\Delta$ it may nevertheless happen that $\mathcal{D}\Delta = \Delta\mathcal{D}$; for example, if

\[ \mathcal{D} = \frac{d^2}{dz^2} - 2z^{-2} \quad\text{and}\quad \Delta = \frac{d^3}{dz^3} - 3z^{-2}\frac{d}{dz} + 3z^{-3}. \]

Then the set of all polynomials $P(\mathcal{D}, \Delta)$ in $\mathcal{D}$ and $\Delta$ with constant coefficients is a
commutative ring, denoted by $R_{\mathcal{D},\Delta}$. Now something quite unexpected happens:
if $\mathcal{D}\Delta = \Delta\mathcal{D}$ then there exists a nonzero polynomial $F(x, y)$ with constant
coefficients such that $F(\mathcal{D}, \Delta) = 0$, that is, $\mathcal{D}$ and $\Delta$ satisfy a polynomial relation. For
example, if

\[ \mathcal{D} = \frac{d^2}{dz^2} - 2z^{-2} \quad\text{and}\quad \Delta = \frac{d^3}{dz^3} - 3z^{-2}\frac{d}{dz} + 3z^{-3} \]

then $F = \mathcal{D}^3 - \Delta^2$; we can assume that $F$ is irreducible. Then the ring $R_{\mathcal{D},\Delta}$ is
isomorphic to $\mathbb{C}[x, y]/(F(x, y))$, or in other words, to the ring $\mathbb{C}[C]$ where $C$ is
an irreducible curve with equation $F(x, y) = 0$. If the operators $\mathcal{D}$ and $\Delta$ have a
common eigenfunction $f$, then this function will also be an eigenfunction for all
operators of $R_{\mathcal{D},\Delta}$. Taking any operator into its eigenvalue on the eigenfunction
$f$ is a homomorphism $R_{\mathcal{D},\Delta} \to \mathbb{C}$. In view of the isomorphism $R_{\mathcal{D},\Delta} \cong \mathbb{C}[C]$, this
homomorphism defines a point of $C$. It can be shown that every point of the
curve corresponds to a common eigenfunction of the operators $\mathcal{D}$ and $\Delta$. The
relation between commuting differential operators and algebraic curves just
described has in recent times allowed a significant clarification of the structure
of commuting rings of operators.
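
Both relations in this example, the commutation and the identity between the cube of the second-order operator and the square of the third-order one, can be tested on Laurent monomials. The sketch below encodes an operator as a list of terms `(c, m, j)`, each meaning $c\,z^m\,d^j/dz^j$, and a Laurent polynomial as a dictionary from exponents to coefficients; this encoding is an assumption of the example, and checking monomials only confirms the relations on Laurent polynomials, by linearity.

```python
def falling(k: int, j: int) -> int:
    """k (k - 1) ... (k - j + 1): the factor produced by differentiating z**k j times."""
    out = 1
    for i in range(j):
        out *= k - i
    return out


def apply(op, f):
    """Apply an operator (list of (c, m, j) = c * z**m * d^j/dz^j) to a Laurent
    polynomial f given as {exponent: coefficient}."""
    g = {}
    for c, m, j in op:
        for k, a in f.items():
            coef = c * a * falling(k, j)
            if coef:
                e = k - j + m
                g[e] = g.get(e, 0) + coef
    return {e: v for e, v in g.items() if v != 0}


D = [(1, 0, 2), (-2, -2, 0)]                  # d^2/dz^2 - 2 z^-2
DELTA = [(1, 0, 3), (-3, -2, 1), (3, -3, 0)]  # d^3/dz^3 - 3 z^-2 d/dz + 3 z^-3

monomials = [{k: 1} for k in range(-8, 9)]
commute = all(apply(D, apply(DELTA, f)) == apply(DELTA, apply(D, f)) for f in monomials)
cube_equals_square = all(
    apply(D, apply(D, apply(D, f))) == apply(DELTA, apply(DELTA, f)) for f in monomials
)
```

On the monomial $z^k$ the first operator multiplies by $(k - 2)(k + 1)$ and lowers the exponent by 2, while the second multiplies by $(k - 1)(k - 3)(k + 1)$ and lowers it by 3, from which both identities can also be read off by hand.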
§5. Modules

Consider some domain $V$ in space and the vector fields defined on it. These can
be added and multiplied by numbers, carrying out these operations on vectors
applied to one point. Thus all vector fields form an infinite-dimensional vector
space. But in addition to this, they can be multiplied by functions. This operation
is very useful, since every vector field can be written in the form

\[ A\frac{\partial}{\partial x} + B\frac{\partial}{\partial y} + C\frac{\partial}{\partial z}, \]

where $A$, $B$ and $C$ are functions; hence it is natural to consider the set of vector
fields as being 3-dimensional over the ring of functions. We thus arrive at the
notion of a module over a ring (in this section, we only deal with commutative rings).
This differs from a vector space only in that for a module, multiplication of its
elements by ring elements is defined, rather than by field elements as for a vector
space. The remaining axioms, both those for the addition of elements, and for
multiplication by ring elements, are exactly as before, and we will not repeat
them.


Example 1. A ring is a module over itself; this is an analogue of a 1-dimensional 
vector space. 


Example 2. Differential forms of a given degree on a differentiable (or real or 
complex analytic) manifold form a module over the ring of differentiable (or real 
or complex analytic) functions on the manifold. The same holds for vector fields, 
and quite generally for tensor fields of a fixed type. (We will discuss the definition 
of all these notions in more detail later in §5 and in § 7). 


Example 3. If $\varphi$ is a linear transformation of a vector space $L$ over a field $K$,
then we can make $L$ into a module over the ring $K[t]$ by setting

\[ f(t)x = (f(\varphi))(x) \quad\text{for } f(t) \in K[t] \text{ and } x \in L. \]
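
The module structure of Example 3 can be sketched for a transformation given by a matrix. The code assumes integer entries, polynomials stored as coefficient lists from the constant term upwards, and a quarter-turn rotation of the plane as the sample transformation; all of these are choices made for the illustration.

```python
def mat_vec(m, x):
    """Apply the matrix m (list of rows) to the vector x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in m]


def act(f, phi, x):
    """f(t) x = f(phi)(x), evaluated by Horner's rule; f = [c0, c1, ...]."""
    result = [0] * len(x)
    for c in reversed(f):
        result = [r + c * xi for r, xi in zip(mat_vec(phi, result), x)]
    return result


def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


phi = [[0, -1], [1, 0]]  # rotation by a quarter turn, so phi^2 = -identity
x = [3, 5]
t_times_x = act([0, 1], phi, x)          # t acts as phi itself
killed = act([1, 0, 1], phi, x)          # t^2 + 1 acts as phi^2 + 1 = 0
f, g = [1, 2], [0, 1, 3]
associative = act(poly_mul(f, g), phi, x) == act(f, phi, act(g, phi, x))
```

In this module every element is annihilated by the nonzero ring element $t^2 + 1$, something that cannot happen in a nonzero vector space over a field.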


Example 4. The ring of linear differential operators with constant coefficients
(§3, Example 3) acts on the space of functions ($C^\infty$, of compact support,
exponentially decaying, polynomial), and makes each of these spaces into a module over
this ring. Since this ring is isomorphic to the polynomial ring $\mathbb{R}[t_1, \dots, t_n]$ (§3,
Example 3), each of the indicated spaces is a module over the polynomial ring.
Of course, the same remains true if we replace the field $\mathbb{R}$ by $\mathbb{C}$.


Example 5. Let $M$ and $N$ be modules over a ring $A$. Consider the module
consisting of pairs $(m, n)$ for $m \in M$, $n \in N$, with addition and multiplication by
elements of $A$ given by

\[ (m, n) + (m_1, n_1) = (m + m_1, n + n_1) \quad\text{and}\quad a(m, n) = (am, an). \]

This module is called the direct sum of $M$ and $N$ and is denoted by $M \oplus N$. The
direct sum of any number of modules can be defined in the same way. The sum
of $n$ copies of the module $A$ (Example 1) is denoted by $A^n$ and is called the free
module of rank $n$. This is the most direct generalisation of an $n$-dimensional vector
space; elements of $A^n$ are $n$-tuples of the form

\[ m = (a_1, \dots, a_n) \quad\text{with } a_i \in A. \]

If $e_i = (0, \dots, 1, \dots, 0)$ with 1 in the $i$th place then $m = \sum a_i e_i$, and this
representation is unique.

It is sometimes also useful to consider algebraic analogues of
infinite-dimensional vector spaces, the direct sum of a family $\Sigma$ of modules isomorphic
to $A$. Elements of this sum are specified as sequences $\{a_\sigma\}_{\sigma \in \Sigma}$ with $a_\sigma \in A$ as $\sigma$
runs through $\Sigma$, and $a_\sigma \ne 0$ for only a finite number of $\sigma \in \Sigma$. With the elements
$e_\sigma$ defined as before, every element of the direct sum has a unique representation
as a finite sum $\sum a_\sigma e_\sigma$. The module we have constructed is a free module, and the
$\{e_\sigma\}$ a basis or a free family of generators of it.


Example 6. In a module $M$ over the ring $\mathbb{Z}$ the multiplication by a number
$n \in \mathbb{Z}$ is already determined once the addition is defined:

\[ \text{if } n > 0 \text{ then } nx = x + \dots + x \quad (n \text{ times}) \]

and if $n = -m$ with $m > 0$ then $nx = -(mx)$. Thus $M$ is just an Abelian group²
written additively.

We omit the definitions of isomorphism and submodule, which repeat word
for word the definition of isomorphism and subspace for vector spaces. An
isomorphism of modules $M$ and $N$ is written $M \cong N$.


Example 7. Any differential $r$-form on $n$-dimensional Euclidean space $\mathbb{R}^n$ can
be uniquely written in the form

\[ \sum_{i_1 < \dots < i_r} a_{i_1 \dots i_r}\, dx_{i_1} \wedge \dots \wedge dx_{i_r}, \]

where $a_{i_1 \dots i_r}$ belongs to the ring $A$ of functions on $\mathbb{R}^n$ (differentiable, real analytic
or complex analytic, see Example 2). Hence the module of differential forms is
isomorphic to $A^{\binom{n}{r}}$, where $\binom{n}{r}$ is the binomial coefficient.


Example 8. Consider the polynomial ring $\mathbb{C}[x_1, \dots, x_n]$ as a module $M$ over
itself (Example 1); on the other hand, consider it as a module over the ring of
differential operators with constant coefficients (Example 4). Since this ring is
isomorphic to the polynomial ring, we get a new module $N$ over $\mathbb{C}[x_1, \dots, x_n]$.
These modules are not isomorphic; in fact for any $m' \in N$ there exists a non-zero
element $a \in \mathbb{C}[x_1, \dots, x_n]$ such that $am' = 0$ (take $a$ to be any differential operator
of sufficiently high order). But since $\mathbb{C}[x_1, \dots, x_n]$ is an integral domain, it follows
that in $M$, $am = 0$ implies that $a = 0$ or $m = 0$.
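
The difference between the two module structures is easy to see in one variable. The sketch below simplifies Example 8 to the case $n = 1$ with integer coefficients: in the module `M` a polynomial acts by multiplication, and in the module `N` the same polynomial in $t$ acts with $t$ replaced by $d/dx$. The function names are illustrative.

```python
def poly_mul(a, m):
    """Action in M: ordinary multiplication of polynomials (coefficient lists)."""
    out = [0] * (len(a) + len(m) - 1)
    for i, c in enumerate(a):
        for j, d in enumerate(m):
            out[i + j] += c * d
    return out


def derivative(m):
    """Derivative of a polynomial given as a coefficient list (constant term first)."""
    return [k * c for k, c in enumerate(m)][1:]


def act_by_differentiation(a, m):
    """Action in N: a = [a0, a1, ...] acts as a0 + a1 d/dx + a2 d^2/dx^2 + ..."""
    total = [0] * len(m)
    current = list(m)
    for c in a:
        for k, coeff in enumerate(current):
            total[k] += c * coeff
        current = derivative(current)
    return total


def is_zero(p):
    return all(c == 0 for c in p)


m = [1, 0, 2]       # the polynomial 1 + 2x^2, of degree 2
a = [0, 0, 0, 1]    # the ring element t^3
in_N = act_by_differentiation(a, m)  # third derivative of m
in_M = poly_mul(a, m)                # x^3 * m
```

In `N` the element $t^3$ kills the polynomial of degree 2, whereas in `M` the product of two nonzero polynomials is never zero.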

In a series of cases, Fourier transform establishes an isomorphism of modules
$M$ and $N$ over the ring $A = \mathbb{C}[x_1, \dots, x_n]$, where $M$ and $N$ are modules consisting
of functions, and $A$ acts on $M$ by multiplication, and on $N$ via the isomorphism

\[ \mathbb{C}[t_1, \dots, t_n] \cong \mathbb{C}\left[\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n}\right]. \]

For example, this is the case if $M = N$ is the space of $C^\infty$ functions $F(x_1, \dots, x_n)$
for which

\[ \left| x_1^{\alpha_1} \cdots x_n^{\alpha_n}\, \frac{\partial^{\beta_1 + \dots + \beta_n} F}{\partial x_1^{\beta_1} \cdots \partial x_n^{\beta_n}} \right| \]

is bounded for all $\alpha_i \ge 0$, $\beta_i \ge 0$.

² We assume that the reader knows the definition of a group and of an Abelian group; these will be
repeated in §12.

Recalling the definition of §4, we can now say that an ideal of a ring $A$ is a
submodule of $A$, if $A$ is considered as a module over itself (as in Example 1). Ideals
which are distinct as subsets of $A$ can be isomorphic as $A$-modules. For example,
an ideal $I$ of an integral domain $A$ is isomorphic to $A$ as an $A$-module if and only
if it is principal (because if $I = (i)$ then $a \mapsto ai$ is the required homomorphism;
conversely, if $\varphi\colon A \to I$ is an isomorphism of $A$-modules, and $1$ is the identity
element of $A$ then $\varphi(1) = i \in I$ implies that $\varphi(a) = \varphi(a \cdot 1) = a\varphi(1) = ai$, that is
$I = (i)$). Hence the set of ideals of a ring which are non-isomorphic as modules
is a measure of its failure to be a principal ideal domain. For example, in the ring
$A_d = \mathbb{Z} + \mathbb{Z}\sqrt{d}$ consisting of numbers of the form $a + b\sqrt{d}$ with $a, b \in \mathbb{Z}$ (where
$d$ is some integer), there are only a finite number of non-isomorphic ideals. This
number is called the class number of $A_d$, and is a basic arithmetic invariant.


Example 9. Let $\{m_\alpha\}$ be a set of elements of a module $M$ over a ring $A$. Consider
all possible linear combinations $\sum a_\alpha m_\alpha$ with coefficients $a_\alpha \in A$ (even if the set
$\{m_\alpha\}$ is infinite, each linear combination only involves finitely many terms). These
form a submodule of the module $M$, called the submodule generated by the $\{m_\alpha\}$.
In particular, if $M = A$ as a module over itself, we arrive back at the notion of
the ideal generated by elements $\{m_\alpha\}$ which we have already met. If the system
$\{m_\alpha\}$ generates the whole of $M$, it is called a system of generators of $M$.

The notion of a linear map of one vector space to another carries over
word-for-word to modules; in this case such a map is called an $A$-linear map, or
a homomorphism. Exactly as for the case of an ideal in a ring, for a submodule
$N \subset M$ we can define its cosets $m + N$, the quotient module $M/N$ and the
canonical homomorphism $M \to M/N$. The notions of image and kernel, and the
relation between homomorphisms and submodules formulated in §4 for the case
of rings and ideals also carry over.

These notions allow us to define certain important constructions. By
definition, we know how to add elements of a module $M$ and multiply them by elements
of $A$, but we don’t know how to multiply two elements together. However, in
some situations there arises an operation of multiplying elements of a module
$M$ by elements of a module $N$, and getting a value in some third module $L$. For
example, if $M$ consists of vector fields $\sum f_i \frac{\partial}{\partial x_i}$ and $N$ of differential 1-forms
$\sum p_i\, dx_i$ then the product $\sum f_i p_i$ is defined, and belongs to the ring of functions
(and is independent of the choice of coordinates $x_1, \dots, x_n$). In a similar way, one
can define (independently of the choice of coordinates) a product of a vector field
by a differential $r$-form, the result of which is a differential $(r - 1)$-form.

We define a multiplication defined on two modules $M$ and $N$ and with values
in a third module $L$ to be a map which takes a pair of elements $x \in M$, $y \in N$ into
an element $xy \in L$, having the following bilinearity properties:

\[ \begin{aligned}
(x_1 + x_2)y &= x_1 y + x_2 y && \text{for } x_1, x_2 \in M \text{ and } y \in N; \\
x(y_1 + y_2) &= x y_1 + x y_2 && \text{for } x \in M \text{ and } y_1, y_2 \in N; \\
(ax)y = x(ay) &= a(xy) && \text{for } x \in M,\ y \in N \text{ and } a \in A.
\end{aligned} \]

If a multiplication $xy$ is defined on two modules $M$ and $N$ with values in $L$,
and if $\varphi\colon L \to L'$ is a homomorphism, then $\varphi(xy)$ defines a multiplication with
values in $L'$. It turns out that all possible multiplications on given modules $M$
and $N$ can be obtained in this way from a single ‘universal’ one. This has values
in a module which we denote by $M \otimes_A N$, and the product of elements $x$ and
$y$ is also denoted by $x \otimes y$. The universality consists of the fact that for any
multiplication $xy$ defined on $M$ and $N$ with values in $L$, there exists a unique
homomorphism

\[ \varphi\colon M \otimes_A N \to L \quad\text{for which } xy = \varphi(x \otimes y). \]


It is easy to show that if a module and a product with this universality property
exist, then they are defined uniquely up to isomorphism. The construction of the
module $M \otimes_A N$ and the multiplication $x \otimes y$ is as follows: suppose that $M$ has
a finite set of generators $x_1, \dots, x_m$ and $N$ a set $y_1, \dots, y_n$. We consider symbols
$(x_i, y_j)$, and the free module $S = A^{mn}$ with these as generators. In $S$, consider the
elements

\[ \sum_i a_i (x_i, y_j) \quad\text{for which } \sum_i a_i x_i = 0 \text{ in } M, \]

and the elements

\[ \sum_j b_j (x_i, y_j) \quad\text{for which } \sum_j b_j y_j = 0 \text{ in } N, \]

and consider the submodule $S_0$ generated by these elements. We set

\[ M \otimes_A N = S/S_0 \]

and if $x = \sum a_i x_i$ and $y = \sum b_j y_j$ then

\[ x \otimes y = \sum_{i,j} a_i b_j (x_i \otimes y_j), \]

where $x_i \otimes y_j$ denotes the image in $S/S_0$ of $(x_i, y_j)$ under the canonical
homomorphism $S \to S/S_0$. It is easy to check that $x \otimes y$ does not depend on the choice of
the expressions of $x$ and $y$ in terms of generators, and that in this way we actually
get a universal object. More intrinsically, and without requiring that $M$ and $N$
have finite systems of generators, we could construct the module $M \otimes_A N$ by
taking as generators of $S$ all possible pairs $(x, y)$ with $x \in M$ and $y \in N$, and $S_0$
to be the submodule generated by the elements

\[ \begin{gathered}
(x_1 + x_2, y) - (x_1, y) - (x_2, y), \qquad (x, y_1 + y_2) - (x, y_1) - (x, y_2), \\
a(x, y) - (x, ay), \qquad a(x, y) - (ax, y).
\end{gathered} \]

This way, we have to use a free module $S$ on an infinite set of generators, even if
we are dealing with modules $M$ and $N$ having finitely many generators. However,
there is nothing arbitrary in the construction related to the choice of systems of
generators.

The module $M \otimes_A N$ defined in this way is called the tensor product of the
modules $M$ and $N$, and $x \otimes y$ the tensor product of the elements $x$ and $y$. If $M$
and $N$ are finite-dimensional vector spaces over a field $K$, then $M \otimes_K N$ is also
a vector space, and

\[ \dim(M \otimes_K N) = \dim M \cdot \dim N. \]
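
For vector spaces with chosen bases the tensor product of elements can be written in coordinates. The sketch below assumes coordinate vectors with integer entries and orders the basis of the tensor product so that the second index runs fastest (the usual Kronecker product convention); the names are illustrative and the model covers only the finite-dimensional case.

```python
def tensor(x, y):
    """Coordinates of the tensor product of x and y in the basis of products
    of basis vectors, with the index of the second factor running fastest."""
    return [xi * yj for xi in x for yj in y]


def vec_add(u, v):
    return [a + b for a, b in zip(u, v)]


def unit_vectors(n):
    """The standard basis of an n-dimensional coordinate space."""
    return [[int(i == j) for j in range(n)] for i in range(n)]


sample = tensor([1, 2], [3, 4, 5])

# Bilinearity in the first argument, on one sample.
x1, x2, y = [1, 2], [0, -1], [3, 4, 5]
bilinear = tensor(vec_add(x1, x2), y) == vec_add(tensor(x1, y), tensor(x2, y))

# Products of basis vectors give the standard basis of the 6-dimensional space.
basis_products = [tensor(e, f) for e in unit_vectors(2) for f in unit_vectors(3)]
```

The $2 \cdot 3 = 6$ products of basis vectors are exactly the standard basis of a 6-dimensional space, which is the dimension formula in this case.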


If $M$ is a module over the ring $\mathbb{Z}$, then $M \otimes_{\mathbb{Z}} \mathbb{Q}$ is a vector space over $\mathbb{Q}$; for
example if $M = \mathbb{Z}^n$ then $M \otimes_{\mathbb{Z}} \mathbb{Q} = \mathbb{Q}^n$. But if $M = \mathbb{Z}/(n)$ then $M \otimes_{\mathbb{Z}} \mathbb{Q} = 0$, that
is, $M$ is killed off on passing to $M \otimes_{\mathbb{Z}} \mathbb{Q}$; although any element $m \in M$
corresponds to $m \otimes 1$ in $M \otimes_{\mathbb{Z}} \mathbb{Q}$, this is 0, as one checks easily from the bilinearity
conditions. In a similar way, from a module $M$ over an integral domain $A$ we
can get a vector space $M \otimes_A K$ over its field of fractions $K$. In exactly the same
way, a vector space $E$ over a field $K$ defines a vector space $E \otimes_K L$ over any
extension $L$ of $K$. When $K = \mathbb{R}$ and $L = \mathbb{C}$ this is the operation of
complexification which is very useful in linear algebra (for example, in the study of linear
transformations).

If $M_i$ is a vector space of functions $f(x_i)$ of a variable $x_i$ (for example, the
polynomials $f(x_i)$ of degree $< k_i$), then $M_1 \otimes \dots \otimes M_n$ consists of linear
combinations of functions

\[ f_1(x_1) \cdots f_n(x_n) \quad\text{with } f_i \in M_i \]

in the space of functions of $x_1, \dots, x_n$. In particular, the ‘degenerate kernels’ of
the theory of integral equations are of this form. It is natural quite generally to
try to interpret spaces of functions (of one kind or another) $K(x, y)$ of variables
$x$, $y$ as tensor products of spaces of functions of $x$ and of $y$. This is how the
analogues of the notion of tensor products arise in the framework of Banach and
topological vector spaces. The classical functions $K(x, y)$ arise as kernels of
integral operators

\[ f \mapsto \int K(x, y) f(y)\, dy. \]

In the general case the elements of tensor products are also used for specifying
operators of Fredholm type. A similar role is played by tensor products in
quantum mechanics. If spaces $M_1$ and $M_2$ are state spaces of quantum-mechanical
systems $S_1$ and $S_2$ then $M_1 \otimes M_2$ describes the states of the system composed of
$S_1$ and $S_2$.


Example 10. The module $M \otimes_A \dots \otimes_A M$ ($r$ factors) is denoted by $T^r(M)$. If $M$
is a finite-dimensional vector space over $K$, then $T^r(M)$ is the space of
contravariant tensors of degree $r$.


Example 11. The quotient module of $M \otimes_A M$ by the submodule generated by
the elements $x \otimes y - y \otimes x$ for $x, y \in M$ is called the symmetric square of $M$, and
is denoted by $S^2 M$; it is universal for commutative multiplication $xy$ of $x, y \in M$.
In a similar way we can define the $r$th symmetric power $S^r M$; this is the quotient
module of $T^r(M)$ by the submodule generated by all possible elements

\[ x_1 \otimes \dots \otimes x_i \otimes x_{i+1} \otimes \dots \otimes x_r - x_1 \otimes \dots \otimes x_{i+1} \otimes x_i \otimes \dots \otimes x_r \]

for $i = 1, \dots, r - 1$, where $x_i \in M$. For example, if $M$ is the module of linear forms
in variables $t_1, \dots, t_n$ with coefficients in the field $K$, then $S^r M$ consists of all forms
(that is, homogeneous polynomials) of degree $r$ in $t_1, \dots, t_n$.

Obviously, a product of $r$ elements $x_1, \dots, x_r \in M$ with values in $S^r M$ is always
defined, and does not depend on the order of the factors: just consider the
image of $x_1 \otimes \dots \otimes x_r$ under the canonical homomorphism $T^r(M) \to S^r M$. These
products generate $S^r M$.
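
When the module consists of linear forms in $n$ variables, the products of the variables in the symmetric power are just monomials, and the independence of the order of the factors becomes visible. The sketch below records a monomial as the sorted tuple of the indices of its factors; this encoding, the function names, and the count of monomials by a binomial coefficient (a standard fact not stated in the text) are additions made for the illustration.

```python
from itertools import combinations_with_replacement
from math import comb


def symmetric_product(*indices):
    """Image of the tensor product of variables with the given indices in the
    symmetric power: the monomial, recorded as a sorted tuple of indices."""
    return tuple(sorted(indices))


def monomials(n, r):
    """All monomials of degree r in n variables, as sorted index tuples."""
    return list(combinations_with_replacement(range(1, n + 1), r))


quadrics_in_3_variables = monomials(3, 2)
```

For instance there are 6 monomials of degree 2 in 3 variables, and the products taken in the orders (2, 1, 3) and (3, 2, 1) give the same monomial.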


Example 12. The $r$th exterior power of a module $M$ is the quotient module of
$T^r(M)$ by the submodule generated by expressions $x_1 \otimes \dots \otimes x_r$ in which two
factors coincide, say $x_i = x_j$. The exterior power is denoted by $\bigwedge^r M$. For
example, the module of differential $r$-forms on a differentiable manifold is isomorphic to
$\bigwedge^r M$, where $M$ is the module of differential 1-forms. By analogy with the case
of the symmetric power, the multiplication of $r$ elements $x_1, \dots, x_r$ of $M$ with
values in $\bigwedge^r M$ is defined; it is denoted by $x_1 \wedge \dots \wedge x_r$, and is called their
exterior product. By definition, $x_1 \wedge \dots \wedge x_r = 0$ if $x_i = x_j$. It follows easily from
this that

\[ x_1 \wedge \dots \wedge x_i \wedge x_{i+1} \wedge \dots \wedge x_r = -x_1 \wedge \dots \wedge x_{i+1} \wedge x_i \wedge \dots \wedge x_r. \]

If $M$ has a finite number of generators $x_1, \dots, x_n$ then the products

\[ x_{i_1} \wedge \dots \wedge x_{i_r} \quad\text{for } 1 \le i_1 < i_2 < \dots < i_r \le n \]

are generators for $\bigwedge^r M$. In particular, $\bigwedge^r M = 0$ for $r > n$. If $M$ is an
$n$-dimensional vector space over a field $K$, then $\dim \bigwedge^r M = \binom{n}{r}$ for $r \le n$.
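
The two rules for the exterior product, vanishing on a repeated factor and change of sign under the exchange of neighbours, are enough to reduce any product of basis vectors to a standard one. The sketch below assumes that the generators form a basis of a vector space and encodes a product by the indices of its factors; the function names are illustrative.

```python
from itertools import combinations
from math import comb


def wedge(*indices):
    """Reduce the exterior product of basis vectors with the given indices to
    (sign, increasing indices); the result is (0, ()) if an index repeats."""
    if len(set(indices)) < len(indices):
        return (0, ())
    idx = list(indices)
    sign = 1
    # Bubble sort: each exchange of two neighbouring factors changes the sign.
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return (sign, tuple(idx))


def basis(n, r):
    """Index sets i_1 < ... < i_r labelling the standard products in the rth exterior power."""
    return list(combinations(range(1, n + 1), r))
```

The number of standard products for $n = 4$, $r = 2$ is 6, the binomial coefficient, and there are none at all when $r > n$.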


Example 13. If $M$ is a module over a ring $A$ then the set $M^*$ of all
homomorphisms of $M$ to $A$ is a module, if we define operations by

\[ \begin{aligned}
(f + g)(m) &= f(m) + g(m) && \text{for } f, g \in M^* \text{ and } m \in M; \\
(af)(m) &= af(m) && \text{for } f \in M^*,\ a \in A \text{ and } m \in M.
\end{aligned} \]

This module is called the dual module of $M$. If $M$ is a vector space over a field
$K$, then $M^*$ is the dual vector space. The space of differential 1-forms on a
differentiable manifold (as a module over the ring of differentiable functions) is
the dual of the module of vector fields.

The elements of the space $T^r(M^*)$ are called covariant tensors; the elements of
$T^p(M) \otimes T^q(M^*)$ are called tensors of type $(p, q)$.

If M is the space of tensors of type (p,q) over a vector space and N the space 
of tensors (p’,q’), then M @ N is the space of tensors of type (p + p’,g + q’), and 
® is the operation of multiplying tensors. 
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In conclusion, we attempt to extend to modules the ‘functional’ intuition which 
we discussed in § 4 as applied to rings. We start with an example. 

Let X be a differentiable manifold, A the ring of differentiable functions on it,
and M the A-module of vector fields on X. At a given point $x \in X$, every vector
field t takes a value t(x), that is, there is defined a map $M \to T_x$, where $T_x$ is the
tangent space to X at x. This map can be described in algebraic terms, by defining
multiplication of constants $\alpha \in \mathbb{R}$ by a function $f \in A$ by $f \cdot \alpha = f(x)\alpha$. Then $\mathbb{R}$
will be a module over A, and $T_x = M \otimes_A \mathbb{R}$, and our map takes t into the element
$t \otimes 1$. In this form, we can construct this map for an arbitrary module M over
an arbitrary ring A. Let $\varphi\colon A \to K$ be a homomorphism of A into a field with
$\varphi(A) = K$ and kernel the maximal ideal $\mathfrak{m}$; then K is a module over A if we set
$a\alpha = \varphi(a)\alpha$ for $a \in A$ and $\alpha \in K$. Hence there is defined a vector space
$M_{\mathfrak{m}} = M \otimes_A K$ over K, the ‘value of M at the point $\mathfrak{m}$’. For example, if A = K[C], where
C is an algebraic curve (or any algebraic variety), then as we saw in § 4, any point
$c \in C$ defines a homomorphism $\varphi_c\colon A \to K$, where $\varphi_c(f) = f(c)$, and the maximal
ideal $\mathfrak{m}_c$ consisting of functions $f \in A$ with f(c) = 0.

Thus each module M over K[C] defines a family of vector spaces $M_{\mathfrak{m}_c}$,
‘parametrised by’ the variety C, and in an entirely similar way, a module M over an
arbitrary ring A defines a family of vector spaces $M \otimes_A (A/\mathfrak{m})$ over the various
residue fields $A/\mathfrak{m}$, ‘parametrised by’ the set of maximal ideals $\mathfrak{m}$ of A.

The geometrical analogue of this situation is the following: a family of vector
spaces over a topological space X is a topological space $\mathcal{E}$ with a continuous map

$$f\colon \mathcal{E} \to X,$$

in which every fibre $f^{-1}(x)$ is given a vector space structure (over R or C),
compatible with the topology of $\mathcal{E}$ in the natural sense. A homomorphism between
families $f\colon \mathcal{E} \to X$ and $g\colon \mathcal{F} \to X$ is a continuous map

$$\varphi\colon \mathcal{E} \to \mathcal{F},$$

taking each fibre $f^{-1}(x)$ into the fibre $g^{-1}(x)$, and inducing a linear map between
them. A family $\mathcal{E}$ of vector spaces defines a module $M_{\mathcal{E}}$ over the ring A(X) of
continuous functions on X. If the family $\mathcal{E}$ is a generalisation of a vector space,
then an element of $M_{\mathcal{E}}$ is a generalisation of a vector: it is a choice of a vector in
each fibre $f^{-1}(x)$ for $x \in X$. More precisely, elements of $M_{\mathcal{E}}$, called sections, are
defined as continuous maps

$$s\colon X \to \mathcal{E},$$

for which the point s(x) belongs to the fibre $f^{-1}(x)$, for all $x \in X$ (that is, f(s(x)) = x).
The operations

$$(s_1 + s_2)(x) = s_1(x) + s_2(x) \quad\text{for } s_1, s_2 \in M_{\mathcal{E}} \text{ and } x \in X;$$

$$(\varphi s)(x) = \varphi(x)s(x) \quad\text{for } \varphi \in A(X),\ x \in X \text{ and } s \in M_{\mathcal{E}},$$

make $M_{\mathcal{E}}$ into a module over A(X).

§6. Algebraic Aspects of Dimension

The basic invariant of a vector space is its dimension, and in this context the
class of finite-dimensional vector spaces is distinguished. For modules, which are 
a direct generalisation of vector spaces, there are analogous notions, which play 
the same fundamental role. On the other hand, we have considered algebraic 
curves, surfaces, and so on, and have ‘coordinatised’ each such object C by 
assigning to it the coordinate ring K[C] or the rational function field K(C). The 
intuitive notion of dimension (1 for an algebraic curve, 2 for a surface, and so on) 
is reflected in algebraic properties of the ring K[C] or of the field K(C), and these
properties are meaningful and important for more general types of rings and 
fields. As one might expect, the situation becomes more complicated in com- 
parison with the simplest examples: we will see that there exist various ways 
of expressing the ‘dimension’ of rings or modules as a number, and various 
analogues of finite dimensionality. 

The dimension of a vector space can be defined from various different starting 
points: firstly, as the maximal number of linearly independent vectors; secondly, 
as the number of vectors in a basis (and here we need to prove that all bases 
of the same vector space consist of the same number of vectors); finally, one 
can make use of the fact that if the dimension is already defined, then an 
n-dimensional space L contains an $(n-1)$-dimensional subspace $L_1$, and $L_1$ an
$(n-2)$-dimensional subspace $L_2$, and so on. We thus get a chain

$$L \supsetneq L_1 \supsetneq L_2 \supsetneq \cdots \supsetneq L_n = 0.$$


Hence the dimension can be defined as the greatest length of such a chain. 
Each of these definitions applies to modules, but here we already get different 
properties, which provide different numerical characteristics of modules; they 
also lead to different analogues of finite dimensionality for modules. We will 
consider all three of these approaches. For the first of these we assume that A is 
an integral domain. 

Elements $m_1, \dots, m_k$ of a module M over a ring A are linearly dependent if there
exist elements $a_1, \dots, a_k \in A$, not all zero, with

$$a_1 m_1 + \cdots + a_k m_k = 0;$$

otherwise they are linearly independent. The maximal number of linearly independent
elements of a module M is called its rank, rank M; if this is finite, then M
is a module of finite rank. The ring A itself is of rank 1 as an A-module, and the
free module $A^n$ has rank n in the new definition.

Despite the apparent similarity, the notion of rank is in substance very far
from the dimension of a vector space. Even if the rank n is finite and $m_1, \dots, m_n$
is a maximal set of linearly independent elements of a module, then it is quite
false that every element m can be expressed in terms of them: in a linear
dependence relation $am + a_1 m_1 + \cdots + a_n m_n = 0$, we cannot in general divide through
by a. Thus we do not get the same kind of canonical description of all elements
of the module as that provided by the basis of a vector space. Moreover, one
might think that modules of rank 0, being analogues of 0-dimensional vector
spaces, should be in some way quite trivial, whereas they can be arbitrarily
complicated. Indeed, a single element $m \in M$ is linearly dependent if there exists
a nonzero element $a \in A$ such that am = 0; in this case we say that m is a torsion
element. A module has rank 0 if it consists entirely of torsion elements; it is then
called a torsion module. For example, any finite Abelian group considered as a
Z-module is a torsion module. A finite-dimensional vector space L with a linear
transformation $\varphi$, considered as a module over the polynomial ring K[x] (§5, Example 3),
is also a torsion module: there exists a polynomial $f(x) \neq 0$ such that $f(\varphi) = 0$ (that is,
$(f(\varphi))(x) = 0$, or $f \cdot x = 0$, for every $x \in L$). The polynomial ring $\mathbb{R}[x_1, \dots, x_n]$ as
a module over the ring of differential operators
$\mathbb{R}\left[\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n}\right]$ (§5, Example 4)
is another example of a torsion module. All of these modules have rank 0,
although, for example, it is intuitively hard to accept the last example as being
even finite-dimensional.

A better approximation to an intuitive notion of finite dimensionality is provided
by the definition of finite dimensionality of a vector space in terms of the
existence of a basis.

A module M having a finite set of generators is said to be finitely generated,
or a module of finite type. Thus M contains a finite system $m_1, \dots, m_k$ of elements
such that any element is a linear combination of these, although in contrast to
vector spaces, we cannot require that this representation is unique.

A ring as a module over itself, and more generally a free module of finite
rank, is of finite type, as is a finite Abelian group as a Z-module and a
finite-dimensional vector space with a given linear transformation as a K[x]-module.
The polynomial ring $\mathbb{R}[x_1, \dots, x_n]$ is not of finite type as a module over the ring of
differential operators
$\mathbb{R}\left[\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n}\right]$:
starting from a finite number of polynomials $F_1, \dots, F_k$, it is not possible to get
polynomials of higher degree by applying differentiations.

A homomorphic image of a module of finite type has the same property: the
image of a system of generators is a system of generators. In particular,
homomorphic images of the free module $A^n$ are all of finite type and are generated by
at most n elements. The converse is also true. If M has generators $m_1, \dots, m_k$
then taking a k-tuple $(a_1, \dots, a_k) \in A^k$ (by definition $A^k$ consists of such k-tuples)
into the element $a_1 m_1 + \cdots + a_k m_k$ is a homomorphism with image M. This
proves the following:

Theorem I. Any module of finite type is a homomorphic image of a free module
of finite type $A^n$.


In particular, a module with a single generator is a homomorphic image of
the ring A itself, that is (by the homomorphisms theorem) is of the form A/I,
where I is an ideal of A; if I = 0 then M is isomorphic to A. A module of
this form is called a cyclic module. We can think of these as analogues of
1-dimensional vector spaces.

In some cases, modules of finite type are rather close to finite-dimensional 
vector spaces. For example, if A is an integral domain in which all the ideals are 
principal (that is, a PID), then we have the following result. 


II. Theorem on Modules over a Principal Ideal Domain. A module of finite
type over a PID is isomorphic to a direct sum of a finite number of cyclic modules.
A cyclic module is either isomorphic to A or decomposes further as a direct sum
of cyclic modules of the form $A/(\pi^k)$ where $\pi$ is a prime element. The representation
of a module as a direct sum of such modules is unique.


If a module M is a torsion module then there are no summands isomorphic
to A. This happens for example if A = Z and M is a finite Abelian group. In this
case the theorem we have stated gives a classification of finite Abelian groups.
The same holds if A = C[x], and M = L is a finite-dimensional vector space
over C with a given linear transformation (§ 5, Example 3). In this case it is easy
to see that our theorem gives the reduction of a linear transformation to Jordan
normal form.
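
The classification of finite Abelian groups can be made concrete with a short
computation. The following is only a sketch under stated assumptions: the helper
names (prime_factorization, partitions, abelian_groups) are illustrative and not
taken from the text. It uses the fact, which follows from Theorem II for A = Z,
that a finite Abelian group of order n is a direct sum of cyclic groups $Z/(p^k)$,
so that for each prime p dividing n the exponents k form a partition of the
exponent of p in n.

```python
from itertools import product


def prime_factorization(n):
    """Return {prime: exponent} for an integer n >= 1, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def partitions(k, largest=None):
    """Yield the partitions of k as non-increasing tuples of positive integers."""
    if largest is None:
        largest = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest


def abelian_groups(n):
    """List the Abelian groups of order n up to isomorphism.

    Each group is given as a sorted list of prime powers q, standing for
    the direct sum of the cyclic modules Z/(q).
    """
    per_prime = []
    for p, e in sorted(prime_factorization(n).items()):
        per_prime.append([[p ** k for k in part] for part in partitions(e)])
    groups = []
    for choice in product(*per_prime):
        groups.append(sorted(q for block in choice for q in block))
    return groups
```

For instance, abelian_groups(4) lists the two groups Z/(4) and Z/(2) ⊕ Z/(2), and
for n = 72 the list has 3 · 2 = 6 entries, one for each pair consisting of a
partition of 3 and a partition of 2.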

One proof of Theorem II is based on a representation of M in the form

$$M = A^n/N \quad\text{with } N \subset A^n$$

(by Theorem I). It is easy to prove that N is also a module of finite type. If
$A^n = Ae_1 \oplus \cdots \oplus Ae_n$ and $N = (u_1, \dots, u_m)$, then

$$u_i = \sum_{j=1}^{n} c_{ij} e_j,$$

and the representation $M = A^n/N$ shows that M is ‘defined by the system of linear
equations’

$$\sum_{j=1}^{n} c_{ij} e_j = 0 \quad\text{for } i = 1, \dots, m.$$

We now apply to this system the idea of Gauss’ method from the classical theory
of systems of linear equations.


Main Lemma. Over a PID, any matrix can be reduced to diagonal form by
multiplying on either side by unimodular matrices.


If the analogue of the Euclidean algorithm holds in the ring then multiplication
on either side by unimodular matrices can be performed by the well-known
elementary transformations (row and column operations): interchanging two
rows, adding a multiple of one row to another, and similar operations on
columns. Applied to the matrix $(c_{ij})$, row and column operations correspond to
the simplest possible transformations of the systems of generators $e_1, \dots, e_n$ and
relations $u_1, \dots, u_m$. In this case, the analogy with Gauss’ method is particularly
obvious.

The main lemma allows us to find systems of generators for which the matrix
$(c_{ij})$ is diagonal. If

$$(c_{ij}) = \begin{pmatrix} \operatorname{diag}(a_1, \dots, a_r) & 0 \\ 0 & 0 \end{pmatrix} \quad\text{with } a_1 \neq 0, \dots, a_r \neq 0,$$

then $M = A^n/N \cong A/(a_1) \oplus \cdots \oplus A/(a_r) \oplus A^{n-r}$. From this it is not hard to get
to the assertion of Theorem II.

In particular if A = Z, Theorem II describes the structure of Abelian groups
with a finite number of generators. Such groups arise, for example, in topology
as the homology or cohomology groups of a finite complex (see §21 for
these).
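
For A = Z the reduction to diagonal form can be carried out mechanically. The
sketch below is an illustration under stated assumptions, not the proof intended
in the text: the function names diagonalize and quotient_structure are
hypothetical, the input is a non-empty integer matrix whose rows are the
relations $u_i$, and the diagonal entries are not normalised so that each divides
the next. It uses only the row and column operations mentioned above, with
division with remainder playing the role of the Euclidean algorithm.

```python
def diagonalize(matrix):
    """Diagonalize an integer matrix by row and column operations.

    Returns the absolute values of the nonzero diagonal entries.
    """
    a = [row[:] for row in matrix]
    rows = len(a)
    cols = len(a[0]) if rows else 0
    diag = []
    t = 0
    while t < min(rows, cols):
        # Choose a nonzero entry of smallest absolute value in the
        # remaining block as pivot.
        pivot = None
        for i in range(t, rows):
            for j in range(t, cols):
                if a[i][j] != 0 and (
                    pivot is None or abs(a[i][j]) < abs(a[pivot[0]][pivot[1]])
                ):
                    pivot = (i, j)
        if pivot is None:
            break  # the remaining block is zero
        i, j = pivot
        a[t], a[i] = a[i], a[t]              # interchange two rows
        for row in a:                        # interchange two columns
            row[t], row[j] = row[j], row[t]
        clean = True
        for i in range(t + 1, rows):         # subtract multiples of row t
            q = a[i][t] // a[t][t]
            for j in range(t, cols):
                a[i][j] -= q * a[t][j]
            if a[i][t] != 0:
                clean = False
        for j in range(t + 1, cols):         # subtract multiples of column t
            q = a[t][j] // a[t][t]
            for i in range(t, rows):
                a[i][j] -= q * a[i][t]
            if a[t][j] != 0:
                clean = False
        # A nonzero remainder is smaller than the pivot in absolute value,
        # so repeating the step with a new pivot must terminate.
        if clean:
            diag.append(abs(a[t][t]))
            t += 1
    return diag


def quotient_structure(relations):
    """Describe Z^n / N, where N is generated by the rows of `relations`.

    Returns (moduli, free_rank): the quotient is isomorphic to the direct
    sum of the Z/(a) for a in moduli and of free_rank copies of Z.
    """
    n = len(relations[0])
    diag = diagonalize(relations)
    moduli = [d for d in diag if d > 1]      # Z/(1) is the zero module
    return moduli, n - len(diag)
```

For example, quotient_structure([[2, 4], [6, 8]]) gives ([2, 4], 0), that is,
Z/(2) ⊕ Z/(4), a group of order 8, which agrees with the absolute value of the
determinant of the relation matrix.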

However, one property, which intuitively is closely related to finite dimen- 
sionality, does not hold in general for a module of finite type: a submodule may 
no longer be of finite type. This can fail even in the simplest case: a submodule 
of a ring A, that is, an ideal, is not always of finite type. For example in the ring
$\mathcal{E}$ of germs of $C^\infty$ functions at $0 \in \mathbb{R}$, the ideal of functions vanishing at 0 together
with all derivatives does not have a finite number of generators (§ 4, Example 7).
In the same way, in the polynomial ring in an infinite number of generators $x_1$,
$x_2, \dots, x_n, \dots$ (each polynomial depends of course only on finitely many of them)
the polynomials with no constant term form an ideal which does not have a finite
number of generators. Thus it is natural to strengthen the finite dimensionality
condition, by considering modules all of whose submodules are of finite type.
We say that a module with this property is Noetherian. This notion can be related
to the so far unused characterisation of the dimension of a vector space in terms
of chains of subspaces. Namely, the Noetherian condition is equivalent to the
following property of a module (called the ascending chain condition or a.c.c.):
any strictly increasing sequence of submodules

$$M_1 \subsetneq M_2 \subsetneq \cdots \subsetneq M_k \subsetneq \cdots$$

is finite. The verification of this equivalence is almost obvious.

These ideas can also be applied to the classification of rings from the point of
view of analogues of finite dimensionality. It is natural to consider rings over
which any module of finite type is Noetherian; a ring with this property is a
Noetherian ring. For this, it is necessary first of all that the ring should be
Noetherian as a module over itself, that is, that every ideal should have a finite
system of generators. But it is not hard to check that this is also sufficient: if
all ideals of a ring A have a finite basis then the free modules $A^n$ are also
Noetherian, and hence also their homomorphic images, that is, all modules of
finite type.

How wide is the notion of a Noetherian ring? Obviously any ring all of whose
ideals are principal is Noetherian. Another fundamental fact is the following
theorem:


III. The Hilbert Basis Theorem. For a Noetherian ring A the polynomial ring
A[x] is again Noetherian.

The proof is based on considering the ideals $J_n \subset A$ (for n = 1, 2, ...), consisting
of elements which are coefficients of leading terms of polynomials of degree n
contained in a given ideal $I \subset A[x]$, and then making repeated use of the
Noetherian property of A. It follows from the Hilbert basis theorem that the
polynomial ring $A[x_1, \dots, x_n]$ in any number of variables is Noetherian if A is.
In particular, the ring $K[x_1, \dots, x_n]$ is Noetherian. It was for this purpose that
Hilbert proved this theorem; he formulated it in the following explicit form.


Theorem. Given any set $\{F_\alpha\}$ of polynomials in $K[x_1, \dots, x_n]$, there exists a
finite subset $F_{\alpha_1}, \dots, F_{\alpha_m}$ such that any polynomial $F_\alpha$ can be expressed as a linear
combination

$$P_1 F_{\alpha_1} + \cdots + P_m F_{\alpha_m} \quad\text{with } P_1, \dots, P_m \in K[x_1, \dots, x_n].$$


But we can go even further. Obviously, if A is Noetherian then the same is true
of any homomorphic image B of A. We say that a ring R containing a subring
A is finitely generated over A, or is a ring of finite type over A if there exists a
finite system of elements $r_1, \dots, r_n$ of R such that all elements of R can be expressed
in terms of them as polynomials with coefficients in A; the elements $r_1, \dots, r_n$ are
called generators of R over A. Consider the polynomial ring $A[x_1, \dots, x_n]$ and
the map

$$F(x_1, \dots, x_n) \mapsto F(r_1, \dots, r_n).$$

This is a homomorphism, and its image is R. Thus we have the result:


Theorem IV. Any ring of finite type over a ring A is a homomorphic image of
the polynomial ring $A[x_1, \dots, x_n]$. From the above it then follows that a ring of
finite type over a Noetherian ring is Noetherian.


For example, the coordinate ring K[C] of an algebraic curve C (or surface, or
an algebraic variety) is Noetherian. If C is given by an equation F(x, y) = 0 then
x and y are generators of K[C] over K.

Other examples of Noetherian rings which are important in applications are
the rings $\mathcal{O}_n$ of functions of n complex variables which are holomorphic at the
origin, and the formal power series ring $K[[t_1, \dots, t_n]]$.

Noetherian rings are the most natural candidates for the role of finite- 
dimensional rings. A notion of dimension can also be defined for these, but this 
would require a rather more precise treatment. 

While the condition that a ring should be a ring of finite type over some simple
ring (for example, over a field) is a concrete, effective form of a finite dimensionality
condition, the Noetherian condition is more intrinsic, although a weaker
assertion. In one important case these notions coincide.

A ring A is graded if it has specified subgroups $A_n$ (that is, submodules of A as
a Z-module) for n = 0, 1, ..., such that for $x \in A_n$ and $y \in A_m$ we have $xy \in A_{n+m}$,
and any element $x \in A$ can be uniquely represented in the form

$$x = x_0 + x_1 + \cdots + x_k \quad\text{with } x_i \in A_i. \tag{1}$$

We say that elements $x \in A_n$ are homogeneous, and the representation (1) is the
decomposition of x into homogeneous components. The subset $A_0$ is obviously
a subring of A.

For example, the ring $K[x_1, \dots, x_m]$ is graded, with $A_n$ the space of homogeneous
polynomials of degree n in $x_1, \dots, x_m$, and $A_0 = K$.
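
The grading of a polynomial ring is easy to experiment with. In the sketch
below, which rests on an assumed representation rather than anything fixed by the
text, a polynomial is a dictionary sending exponent tuples to nonzero
coefficients; the names homogeneous_components and multiply are illustrative.
The first function computes the decomposition (1), and the second lets one check
on examples that a product of homogeneous elements of degrees n and m is
homogeneous of degree n + m.

```python
from collections import defaultdict


def homogeneous_components(poly):
    """Split a polynomial into its homogeneous components.

    `poly` maps exponent tuples to coefficients; the result maps each
    degree n to the component of the polynomial lying in A_n.
    """
    parts = defaultdict(dict)
    for exponents, coeff in poly.items():
        parts[sum(exponents)][exponents] = coeff
    return dict(parts)


def multiply(f, g):
    """Multiply two polynomials given as exponent-tuple dictionaries."""
    result = defaultdict(int)
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            exponents = tuple(a + b for a, b in zip(e1, e2))
            result[exponents] += c1 * c2
    return {e: c for e, c in result.items() if c != 0}
```

For $f = 1 + 2x + xy + y^2$, written as {(0, 0): 1, (1, 0): 2, (1, 1): 1, (0, 2): 1},
the components have degrees 0, 1 and 2, and the product of the degree 1 and
degree 2 components has a single component, of degree 3.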

One checks easily the following result: 


Theorem V. Let A be a graded ring; then A is Noetherian if and only if $A_0$ is
Noetherian and A is a ring of finite type over $A_0$.


Proof. Obviously, the set of elements $x \in A$ for which $x_0 = 0$ in (1) is an ideal
$I_0$. It turns out that for the truth of the assertion in the theorem, it is sufficient
for just this single ideal to be finitely generated. Indeed, we take a set of generators
of $I_0$, represent each generator in the form (1), and consider all the homogeneous
terms $x_i$ appearing in this way. We get a set of homogeneous elements $x_1, \dots,
x_N$ (with $x_i \in A_{n_i}$) which again obviously generate $I_0$. These elements $x_1, \dots, x_N$
are generators for A over $A_0$. Indeed, it is enough to prove that any element
$x \in A_n$ with n > 0 can be expressed as a polynomial in $x_1, \dots, x_N$ with coefficients
in $A_0$. By assumption $I_0 = (x_1, \dots, x_N)$, and in particular

$$x = a_1 x_1 + \cdots + a_N x_N \quad\text{with } a_i \in A.$$

Considering the decomposition of the elements $a_i$ into homogeneous components,
and noting that on the left-hand side $x \in A_n$, we can assume that $a_i \in A_{m_i}$
and $x_i \in A_{n_i}$ with $n_i + m_i = n$. For $n_i = n$ the component $a_i x_i$ is expressed in terms
of $x_i$ with coefficient $a_i \in A_0$ as required, whereas for $n_i < n$ we can apply to $a_i$
the same argument as for x. After a finite number of steps we get the required
expression for x.

For fields, the intuitive notion of finite-dimensionality is realised by analogy
with rings. We say that a field L is an extension of finite type of a subfield K if
there exists a finite number of elements $\alpha_1, \dots, \alpha_n \in L$ such that all the remaining
elements of L can be represented as rational functions of $\alpha_1, \dots, \alpha_n$ with coefficients
in K. In this case we write $L = K(\alpha_1, \dots, \alpha_n)$, and say that L is the extension
of K generated by $\alpha_1, \dots, \alpha_n$. For example, the field of rational functions
$K(x_1, \dots, x_n)$ is an extension of K of finite type. The complex number field is
an extension of finite type of the real number field: complex numbers can be
represented as extremely simple rational functions a + bi of the single element i.
Any finite field $\mathbb{F}_q$ is an extension of finite type of its prime subfield: we could take
$\alpha_1, \dots, \alpha_n$ to be, for example, all the elements of $\mathbb{F}_q$. If C is an irreducible algebraic
curve, given by an equation F(x, y) = 0 then K(C) is an extension of finite type
of K, since all the functions in K(C) are rational functions of the coordinates x
and y. The same holds if C is an algebraic surface, and so on.

These examples make it plausible that for extensions of finite type there exists 
an analogue of the notion of dimension, corresponding to the intuitive notion of 
dimension for algebraic curves, surfaces, and any algebraic varieties. 

A system of elements $\alpha_1, \dots, \alpha_n$ of a field L is said to be algebraically dependent
over a subfield K of L if there exists an irreducible polynomial $F \in K[x_1, \dots, x_n]$,
not identically zero, such that

$$F(\alpha_1, \dots, \alpha_n) = 0.$$

If $\alpha_n$ actually occurs in this relation, we say that the element $\alpha_n$ is algebraically
dependent on $\alpha_1, \dots, \alpha_{n-1}$. Certain very simple properties of algebraic dependence
are just the same as the well-known properties of linear dependence. For example
if an element $\alpha$ is algebraically dependent on $\alpha_1, \dots, \alpha_n$ and each of the $\alpha_i$ is
algebraically dependent on elements $\beta_1, \dots, \beta_m$, then $\alpha$ is algebraically dependent
on $\beta_1, \dots, \beta_m$. From this, repeating formally the well-known arguments for the
case of linear dependence, we can prove that in an extension of finite type there
exists an upper bound for the number of algebraically independent elements. The
maximal number of algebraically independent elements of an extension of finite
type L/K is called the transcendence degree of the extension, and is denoted by
tr deg L/K.

If the transcendence degree of an extension L/K is n, then L contains a set of 
n algebraically independent elements such that any other element is algebraically 
dependent on them; conversely, if n elements with this property exist, then the 
transcendence degree equals n. 

For example, the transcendence degree of the rational function field $K(x_1, \dots, x_n)$
as an extension of K is n. Let C be an irreducible algebraic curve, defined by an
equation F(x, y) = 0. If for example y actually occurs in the equation F then in
the field K(C), the element x is algebraically independent and y is algebraically
dependent on x, and hence so are all other elements of K(C). Hence the transcendence
degree of K(C)/K is 1. In the same way, one proves that if C is an algebraic
surface then the transcendence degree of the field K(C) is 2. We thus arrive at a
notion of dimension which really agrees with geometric intuition. The transcendence
degree of the field K(C), where C is an algebraic variety, is called the
dimension of C, and is denoted by dim C. It enjoys natural properties: for example,

$$\dim C_1 \le \dim C_2 \quad\text{if } C_1 \subset C_2.$$


Example 1. Let X be a compact complex analytic manifold of dimension n and
$\mathcal{M}(X)$ the field of all meromorphic functions on X. It can be proved that

$$\operatorname{tr\,deg} \mathcal{M}(X)/\mathbb{C} \le n.$$

If X is an algebraic variety over C then

$$\mathcal{M}(X) = \mathbb{C}(X) \quad\text{and}\quad \operatorname{tr\,deg} \mathcal{M}(X) = n.$$

Thus the number $\operatorname{tr\,deg} \mathcal{M}(X)/\mathbb{C}$ is a measure of how close the complex manifold
X is to being an algebraic variety; all possible values from 0 to n occur already
in the particular case of complex tori (see § 15).

What does an extension L/K of finite type and of transcendence degree 0 look
like? To say that the transcendence degree is 0 means that any element $\alpha \in L$
satisfies an equation $F(\alpha) = 0$, where F is a nonzero polynomial with coefficients
in K. Such an element $\alpha$ is said to be algebraic over K. Since L/K is an extension
of finite type,

$$L = K(\alpha_1, \dots, \alpha_n) \quad\text{for certain } \alpha_1, \dots, \alpha_n \in L.$$

Thus L/K can be obtained as a composite of extensions of the form $K(\alpha)/K$,
where $\alpha$ is an algebraic element. Conversely, a composite of such extensions
always has transcendence degree 0.

Suppose that $L = K(\alpha)$ where $\alpha$ is an algebraic element over K. Among all
nonzero polynomials $F(x) \in K[x]$ for which $F(\alpha) = 0$ (these exist, since $\alpha$ is algebraic
over K), there exists one of smallest degree; all others are divisible by this one: for
otherwise, by division with remainder, we would arrive at a polynomial of smaller
degree with the same property. This polynomial of smallest degree P is uniquely
determined up to a constant multiple. It is called the minimal polynomial of $\alpha$.
Obviously, P is irreducible over K. Knowing the minimal polynomial P we can
specify all the elements of the field $L = K(\alpha)$ in a very explicit form. For this,
consider the homomorphism

$$\varphi\colon K[x] \to L$$

which takes a polynomial $F \in K[x]$ into the element $F(\alpha) \in L$. The kernel of $\varphi$
is the principal ideal (P), as one sees easily. Hence its image is isomorphic to
K[x]/(P) (by the homomorphisms theorem). It is not hard to show that its image
is the whole of L; for this we should note that $\operatorname{Im}\varphi$ is a field and contains $\alpha$.
Hence L is isomorphic to K[x]/(P). If the degree of P is n then, as we saw in §4,
Formula (2), every element of the field L = K[x]/(P) can be expressed in the form

$$\xi = a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1} \quad\text{with } a_i \in K, \tag{2}$$

and the expression is unique. The classic example of this situation is $K = \mathbb{R}$,
$L = \mathbb{C} = \mathbb{R}(i)$, $P(x) = x^2 + 1$: every complex number can be represented as
a + bi with $a, b \in \mathbb{R}$.
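
The representation (2) makes arithmetic in K[x]/(P) entirely explicit. The
following sketch assumes K = Q, with elements stored as coefficient lists
$[a_0, a_1, \dots, a_{n-1}]$ and the polynomial P as the list of its coefficients
from the constant term upwards; the names reduce_mod and multiply_mod are
illustrative and not part of the text. A product is computed as a product of
polynomials and then reduced by division with remainder by P.

```python
from fractions import Fraction


def reduce_mod(f, p):
    """Remainder of the polynomial f on division by p.

    Both are coefficient lists, constant term first; p has nonzero
    leading coefficient. The result has exactly deg(p) coefficients.
    """
    f = [Fraction(c) for c in f]
    n = len(p) - 1                      # degree of p
    lead = Fraction(p[-1])
    while len(f) > n:
        c = f.pop() / lead              # eliminate the leading term of f
        shift = len(f) - n
        for i in range(n):
            f[shift + i] -= c * p[i]
    f += [Fraction(0)] * (n - len(f))   # pad up to n coefficients
    return f


def multiply_mod(u, v, p):
    """Product of two elements of K[x]/(p), in the form (2)."""
    prod = [Fraction(0)] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] += Fraction(a) * Fraction(b)
    return reduce_mod(prod, p)
```

With $P = x^2 + 1$, that is p = [1, 0, 1], the call multiply_mod([1, 2], [3, 4], p)
returns the coefficients −5 and 10, in agreement with (1 + 2i)(3 + 4i) = −5 + 10i.
With $P = x^3 - 2$ one finds $(1 + \alpha + \alpha^2)(\alpha - 1) = 1$, which exhibits an inverse
explicitly, as one expects in a field.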

The representation (2) for elements of the field $L = K(\alpha)$ leads to an important
corollary. Suppose we forget about the multiplication in L and keep only addition
and multiplication by elements of K. Then (2) shows that the vector space
L is finite-dimensional over K and the elements $1, \alpha, \dots, \alpha^{n-1}$ form a basis of it.
An extension L/K is finite if L is finite-dimensional as a vector space over K. Its
dimension is called the degree of the extension L/K, and is denoted by [L : K]. In
the previous example [L : K] = n; in particular [C : R] = 2.

For example, if $\mathbb{F}_q$ is a finite field and p the characteristic of $\mathbb{F}_q$, then $\mathbb{F}_q$ contains
the prime field with p elements $\mathbb{F}_p$. Obviously, $\mathbb{F}_q/\mathbb{F}_p$ is a finite extension. If
$[\mathbb{F}_q : \mathbb{F}_p] = n$ then there exist n elements $\alpha_1, \dots, \alpha_n \in \mathbb{F}_q$ such that any element
can be uniquely represented in the form

$$\alpha = a_1\alpha_1 + \cdots + a_n\alpha_n \quad\text{with } a_i \in \mathbb{F}_p,$$

and it follows from this that the number of elements of a finite field $\mathbb{F}_q$ is equal
to $p^n$, that is, it is always a power of p.

It is easy to prove that the condition that an extension be finite is transitive,
that is, if L/K and $\Lambda/L$ are finite extensions, then $\Lambda/K$ is also finite, and

$$[\Lambda : K] = [\Lambda : L][L : K]. \tag{3}$$

It follows from the above that any extension of finite type and of transcendence
degree 0 is finite. Conversely, if L/K is a finite extension and [L : K] = n then for
any $\alpha \in L$ the elements $1, \alpha, \dots, \alpha^n$ must be linearly dependent over K (since there
are n + 1 of them). It follows from this that $\alpha$ is algebraic, and hence L has
transcendence degree 0. Thus we obtain another characterisation of extensions
of finite type and of transcendence degree 0; these are the finite extensions. From
what we have said above, any finite extension is obtained as a composite of
extensions of the form $K(\alpha)$. But we have the following result:


VI. Primitive Element Theorem. Suppose that K is a field of characteristic 0,
and that $L = K(\alpha, \beta)$ is an extension generated by two algebraic elements $\alpha$ and $\beta$;
then there exists an element $\gamma \in L$ such that $L = K(\gamma)$.

Under this condition, any finite extension $L = K(\alpha_1, \dots, \alpha_n)$ can be expressed in
the form $L = K(\alpha)$, so that L = K[x]/(P), and we have the representation (2) of
the elements of L.


In fact the result holds under much wider assumptions, and in particular for 
finite fields. 

If every non-constant polynomial with coefficients in a field K has a root in K, that
is, if K is algebraically closed, then all the irreducible polynomials are linear, and
an extension of K cannot contain algebraic elements other than the elements of K.
Hence K does not have any finite extensions other than K itself. This is the case
for the complex number field C. The real number field has only two finite
extensions, R and C. But the rational number field Q and the field K(t) of rational
functions (even for K = C) have very many finite extensions. These are instruments
for the study of algebraic numbers (in the case of Q) and of algebraic functions
(in the case C(t)). It can be shown that any finite extension of K(t) is of the form
K(C) where C is some algebraic curve, and a finite extension of the field
$K(x_1, \dots, x_n)$ is of the form K(V), where V is an algebraic variety (of dimension n).

An extension $K(\alpha)$, where $\alpha$ is a root of an irreducible polynomial P(x), is
determined by this polynomial, and so the theory of finite extensions is a certain
language (and also a ‘philosophy’) in the theory of polynomials in one variable.
In one and the same extension L/K there exist many elements $\alpha$ for which
$L = K(\alpha)$, and many polynomials P(x) corresponding to these. The extension
itself reflects those properties which all of these have in common. We have here
another example of ‘coordinatisation’, analogous to assigning the function field
K(C) to an algebraic curve C. The construction of a field $K(\alpha)$ in the form
K[x]/(P) is entirely parallel to the construction of the field K(C) from the
equation of the curve C.

The most elementary example illustrating applications of properties of extensions
to concrete questions is the theory of ruler and compass constructions.
Translating these constructions into the language of coordinates, it is easy to see
that they lead either to addition, subtraction, multiplication and division operations
on the numbers representing intervals already constructed; or to solving
quadratic equations, the coefficients of which are numbers of this type (to find
the points of intersection of a line and a circle, or of two circles). Hence if we let
K denote the extension of Q generated by all the quantities given in the statement
of the problem, and $\alpha$ the numerical value of the quantity we are looking for,
then the problem of constructing this quantity by ruler and compass reduces to
the question of whether $\alpha$ is contained in an extension L/K which can be
represented as a chain

$$L/L_1,\quad L_1/L_2,\quad \dots,\quad L_{n-2}/L_{n-1},\quad L_{n-1}/L_n \qquad (L_0 = L,\ L_n = K),$$

in which each extension is of the form $L_{i-1} = L_i(\beta)$, where $\beta$ satisfies a quadratic
equation. This condition is equivalent to $[L_{i-1} : L_i] = 2$. Applying the relation
(3) we obtain that $[L : K] = 2^n$. If $\alpha \in L$ then $K(\alpha) \subset L$, and again it follows from
(3) that the degree $[K(\alpha) : K]$ must be a power of 2. This is only a necessary
condition; a sufficient condition for the solvability of a problem by ruler and
compass can also be formulated in terms of the field $K(\alpha)$, but is slightly more
complicated. However, already the necessary condition we have obtained proves,
for example, that the problem of doubling the cube is not solvable by ruler and
compass: it reduces to the construction of a root of the polynomial

$$x^3 - 2, \quad\text{and}\quad [\mathbb{Q}(\sqrt[3]{2}) : \mathbb{Q}] = 3.$$

In exactly the same way, the problem of trisecting an angle leads, for example,
to the construction of $\alpha = \cos(\varphi/3)$, given that $a = \cos\varphi$ is known. This is related
to the cubic equation

$$4\alpha^3 - 3\alpha - a = 0.$$

We should consider a as an independent variable, since $\varphi$ is arbitrary. Hence K
is the field of rational functions Q(a), and $[K(\alpha) : K] = 3$, and again the problem
is not solvable by ruler and compass.
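
The degree computations behind these two impossibility results can be checked
by machine in simple cases. The sketch below is an illustration under stated
assumptions: it works over Q with integer coefficients, the function names are
hypothetical, and it relies on two standard facts not spelled out in the text,
namely the rational root test and the fact that a cubic over Q is irreducible
exactly when it has no rational root. For the trisection it treats only the
special value a = cos 60° = 1/2, where the equation becomes
$8\alpha^3 - 6\alpha - 1 = 0$, rather than the general case with a an independent variable.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of an integer polynomial.

    `coeffs` lists the coefficients from the constant term upwards; the
    constant term and the leading coefficient are assumed nonzero. By the
    rational root test every root is +-p/q with p dividing the constant
    term and q dividing the leading coefficient.
    """
    const, lead = coeffs[0], coeffs[-1]
    roots = set()
    for p in divisors(const):
        for q in divisors(lead):
            for cand in (Fraction(p, q), Fraction(-p, q)):
                if sum(c * cand ** k for k, c in enumerate(coeffs)) == 0:
                    roots.add(cand)
    return roots


def cubic_is_irreducible(coeffs):
    """A cubic over Q is irreducible iff it has no rational root."""
    assert len(coeffs) == 4 and coeffs[0] != 0 and coeffs[-1] != 0
    return not rational_roots(coeffs)
```

Both cubic_is_irreducible([-2, 0, 0, 1]) and cubic_is_irreducible([-1, -6, 0, 8])
hold, so each of the two cubics is the minimal polynomial of its root up to a
constant multiple, and the corresponding extension of Q has degree 3, which is
not a power of 2. By contrast $x^3 - 8$ has the rational root 2.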

In the same way, the question of solving algebraic equations by radicals also 
leads to certain questions on the structure of finite extensions. We will deal with 
this in detail in § 18.A. 
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Considering quantities ‘up to infinitesimals of order n’ can be translated in 
algebraic terms quite conveniently, considering elements ¢ (of certain rings) 
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satisfying «" = 0 as analogues of infinitesimals. Suppose, for example, that C is an 
algebraic curve, for simplicity considered over the complex field C. We introduce 
the commutative ring 


U = {a+ ayela, a, EC, &” = 0}. 


This can be described more precisely as C[x]/(x?), with « the image of x 
under the canonical homomorphism C[x]—- U. Consider homomorphisms 
gy: C[C] > U over C (that is, such that C < C[C] is mapped by the identity to 
C < U). Such a @ is determined by the images ~(x) and ~(y) of the coordinates 
x and y, since the other elements of C[C] are polynomials h(x, y) in x and y, and 
(h(x, y)) = h(@(x), e(y)). Also, if F(x, y) = 0 1s the equation of C then the ele- 
ments g(x) and ¢g(y) of U must satisfy the same equation 


F (g(x), e(y)) = 9. (1) 


We write v(x) = a + a,é, and o(y) = b + b,¢. The ring U has a standard homo- 
morphism w: U > C given by wW(a + a,e) = a. Applying this to the relation (1), 
we get F(a,b) = 0, that is, @ defines a point (a, b) e C. However, knowing this 
point, we can reconstruct only the terms a and b in the expressions for g(x) and 
o(y). What is the meaning of the coefficients a, and b,? We substitute the values 
for p(x) and ~(y)in (1) and write F(a + a,¢,b + b,€)in the standard form c + c,é. 
Expanding F as a Taylor series and using the fact that F(a, b) = 0 and ce? = 0 we 
see that F(a + a,¢,b + bye) = (a, F,(a, b) + 5, Fy(a, b))e, and condition (1) can be 
written 


F(a,b)=0 and a,F.(a,b) + b, F(a,b) =0. 


This means that (a, b) is a point of C and (a,,5,) is a vector lying on the tangent 
line to C at (a,b). Here we assume that (a, b) is not a singular point of C, that is, 
the partial derivatives F{(a,b) and F,(a, b) do not both vanish. It is easy to see 
that our arguments give a description of all homomorphisms of C[C] to U: these 
correspond to pairs consisting of a point of C and a vector of the tangent line to 
the curve at this point. In a similar way, for the case of an algebraic surface we 
get a description of the tangent planes, and so on. 
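The computation above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical sample curve (the circle $x^2 + y^2 - 1 = 0$, not taken from the text) and a small illustrative class `Dual` for elements $a + a_1\varepsilon$ of U; the names `Dual`, `F` and `is_tangent` are introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Element a + a1*eps of U = C[x]/(x^2), where eps**2 == 0."""
    a: complex
    a1: complex = 0

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.a1 + other.a1)
    __radd__ = __add__

    def __sub__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a - other.a, self.a1 - other.a1)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + a1 eps)(b + b1 eps) = ab + (a*b1 + a1*b) eps, because eps^2 = 0
        return Dual(self.a * other.a, self.a * other.a1 + self.a1 * other.a)
    __rmul__ = __mul__


def F(x, y):
    # Hypothetical sample curve (an assumption): the circle x^2 + y^2 - 1 = 0
    return x * x + y * y - 1


def is_tangent(a, b, a1, b1):
    """True if x -> a + a1*eps, y -> b + b1*eps satisfies equation (1)."""
    value = F(Dual(a, a1), Dual(b, b1))
    return value.a == 0 and value.a1 == 0
```

At the point (0, 1) of this circle we have $F'_x = 0$ and $F'_y = 2$, so the $\varepsilon$-coefficient of `F(Dual(0, a1), Dual(1, b1))` is `2 * b1`, and `is_tangent` accepts exactly the horizontal vectors.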

We formulate the previous arguments in a somewhat different way. We compose
$\varphi\colon \mathbb{C}[C] \to U$ with the standard homomorphism
$\psi\colon U \to \mathbb{C}$, to get the sequence


$$\mathbb{C}[C] \xrightarrow{\varphi} U \xrightarrow{\psi} \mathbb{C}.$$


As in §4, Example 2, the composite $\bar\varphi = \psi\varphi$ defines the point
$x_0 \in C$, taking a function into its value at $x_0$. Hence the kernel of
$\bar\varphi$ is the maximal ideal $\mathfrak{M}_{x_0}$ of $\mathbb{C}[C]$,
consisting of functions vanishing at $x_0$. If $x_0 = (a, b)$ then $x - a$ and
$y - b$ belong to $\mathfrak{M}_{x_0}$. This corresponds to the fact that
$\varphi(x - a)$ and $\varphi(y - b)$ are of the form $a_1\varepsilon$ and
$b_1\varepsilon$, that is, they belong to the ideal
$I = \operatorname{Ker}\psi$ of U. A vector of the tangent space at $x_0$ (in
the present case, of the tangent line) is defined by the images of $x - a$ and
$y - b$ lying in this ideal, that is, by the restriction of $\varphi$ to
$\mathfrak{M}_{x_0}$. Since $\varepsilon^2 = 0$, $\varphi$ obviously vanishes on
$\mathfrak{M}_{x_0}^2$. Hence $\varphi$ defines a linear map of the space
$\mathfrak{M}_{x_0}/\mathfrak{M}_{x_0}^2$ into
$I = \mathbb{C}\varepsilon \cong \mathbb{C}$, and precisely this linear function
determines a vector of the tangent line at $x_0$. It is not hard to prove that
any linear function $\mathfrak{M}_{x_0}/\mathfrak{M}_{x_0}^2 \to \mathbb{C}$
defines a tangent vector at $x_0$.


Theorem. The tangent space at a point $x_0$ is the dual vector space to
$\mathfrak{M}_{x_0}/\mathfrak{M}_{x_0}^2$, where $\mathfrak{M}_{x_0}$ is the
maximal ideal corresponding to $x_0$.


The same thing holds for an algebraic surface C with equation F(x, y, z) = 0:
the tangent plane to C at a nonsingular point $x_0 = (a, b, c)$ (that is, a
point at which the three derivatives $F'_x(a, b, c)$, $F'_y(a, b, c)$ and
$F'_z(a, b, c)$ do not all vanish simultaneously) can be identified with the
dual vector space to $\mathfrak{M}_{x_0}/\mathfrak{M}_{x_0}^2$. Later we will
apply these arguments to an arbitrary algebraic variety, but for the moment we
show that they also have applications outside the algebraic case.


Example 1. Let A be the ring of differentiable functions in a neighbourhood
of a point O of an n-dimensional vector space E, and let $\mathfrak{M}$ be the
ideal of functions vanishing at O. By Taylor’s formula, $f \in \mathfrak{M}$ can
be represented in the form $f \equiv l \bmod \mathfrak{M}^2$ where l is a linear
function. Linear functions on E form the dual vector space $E^*$, and we again
get an isomorphism $\mathfrak{M}/\mathfrak{M}^2 \cong E^*$. If $\xi \in E$ then
$l(\xi)$ can be interpreted as the partial derivative
$l(\xi) = \frac{\partial f}{\partial \xi}(O)$.

A similar situation holds if A is the ring of differentiable functions on a
differentiable manifold X and $\mathfrak{M}_{x_0}$ consists of the functions
vanishing at $x_0 \in X$. Again we have
$\mathfrak{M}_{x_0}/\mathfrak{M}_{x_0}^2 \cong T_{x_0}^*$, where $T_{x_0}$ is the
tangent space at $x_0$, and the isomorphism is given by


$$l(\xi) = \frac{\partial f}{\partial \xi}(x_0) \quad\text{for } \xi \in T_{x_0} \text{ and } l = f + \mathfrak{M}_{x_0}^2. \tag{2}$$


The preceding argument presupposed that we already had a definition of the
tangent space of a differentiable manifold, but the argument can be reversed and
turned into the definition of the tangent space,


$$T_{x_0} = (\mathfrak{M}_{x_0}/\mathfrak{M}_{x_0}^2)^*. \tag{3}$$


Thus $\xi \in T_{x_0}$ is by definition a linear function l on
$\mathfrak{M}_{x_0}$ which is zero on $\mathfrak{M}_{x_0}^2$. Setting l to be
equal to zero by definition on constants, we get a function on the whole of A.
It is easy to see that the conditions imposed on l can be written as


$$l(\alpha f + \beta g) = \alpha l(f) + \beta l(g) \quad\text{for } \alpha, \beta \in \mathbb{R} \text{ and } f, g \in A,$$
and
$$l(fg) = l(f)g(x_0) + l(g)f(x_0). \tag{4}$$


In this form they axiomatise the intuitive notion of a tangent vector as ‘that with
respect to which a function can be differentiated’ (as in (2)). The relation (3) or
the equivalent conditions (4) gives perhaps the most intrinsic definition of the
tangent space at a point of a differentiable manifold.

In this connection it is natural to consider the notion of a vector field on a
differentiable manifold. By definition, a vector field $\theta$ assigns to any
point $x \in X$ a vector $\theta(x) \in T_x$. For any function $f \in A$ and a
point $x \in X$, the vector $\theta(x)$ defines a number $\theta(x)(f)$, that
is, a function $g(x) = \theta(x)(f)$. We write $\mathcal{D}(f)$ for this
function, so that $\mathcal{D}$ is an operator on A. The relations (4) show that
$\mathcal{D}$ satisfies the conditions


$$\mathcal{D}(\alpha f + \beta g) = \alpha\mathcal{D}(f) + \beta\mathcal{D}(g),$$
and
$$\mathcal{D}(fg) = f\mathcal{D}(g) + \mathcal{D}(f)g. \tag{5}$$


An operator of this type is called a first order linear differential operator. It is
easy to see that in a coordinate system $(x_1, \dots, x_n)$ it can be written


$$\mathcal{D}(f) = \sum_i a_i \frac{\partial f}{\partial x_i}, \tag{6}$$


where $a_i = \mathcal{D}(x_i)$. Conversely, every operator $\mathcal{D}$
satisfying (5) defines a vector field $\theta$ for which


$$\theta(x)(f) = \mathcal{D}(f)(x).$$
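
For an arbitrary ring A a derivation of A is a map $\mathcal{D}\colon A \to A$
which satisfies


$$\mathcal{D}(a + b) = \mathcal{D}(a) + \mathcal{D}(b),$$
$$\mathcal{D}(ab) = a\mathcal{D}(b) + \mathcal{D}(a)b.$$


If $B \subset A$ is a subring, we say that $\mathcal{D}$ is a derivation of A
over B if $\mathcal{D}(b) = 0$ for $b \in B$. Then
$\mathcal{D}(ab) = \mathcal{D}(a)b$ for $a \in A$, $b \in B$. If we set


$$(\mathcal{D}_1 + \mathcal{D}_2)(a) = \mathcal{D}_1(a) + \mathcal{D}_2(a),$$
$$(c\mathcal{D})(a) = c\mathcal{D}(a) \quad\text{for } a, c \in A,$$


then derivations of A over B form an A-module.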
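As a concrete check of these axioms, here is a small sketch that treats formal differentiation d/dx as a derivation of the polynomial ring $\mathbb{Z}[x]$ over $\mathbb{Z}$. Polynomials are represented as coefficient lists `[a0, a1, ...]`; this representation and the helper names are assumptions made for the illustration.

```python
def poly_add(f, g):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [u + v for u, v in zip(f, g)]


def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, u in enumerate(f):
        for j, v in enumerate(g):
            out[i + j] += u * v
    return out


def deriv(f):
    """Formal derivative: [a0, a1, a2, ...] -> [a1, 2*a2, ...]."""
    return [i * c for i, c in enumerate(f)][1:]


def trim(f):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f
```

With `f = [1, 2, 3]` and `g = [0, 1, 0, 4]`, both `deriv(poly_mul(f, g))` and `poly_add(poly_mul(f, deriv(g)), poly_mul(deriv(f), g))` give the same coefficient list, which is the Leibniz rule; constants are sent to the zero polynomial, so this is a derivation over $\mathbb{Z}$.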

We can thus say that the module of vector fields on a differentiable manifold 
X is by definition the module of derivations over R of the ring of differentiable 
functions on X. Together with the assertions of § 5, Examples 13 and 12, we now 
get an algebraic definition of all the basic notions: vector fields, differential 
1-forms and r-forms on a manifold. 

We now return to arbitrary commutative rings. In § 4 we formulated a general 
conception according to which the elements of an arbitrary commutative ring A 
can be viewed as functions on a ‘space’, the points of which are maximal ideals 
(or in another version, prime ideals) of the ring, and the homomorphisms 
$A \to A/\mathfrak{M}$ define the value of a ‘function’ $a \in A$ at the ‘point’
corresponding to the maximal ideal $\mathfrak{M}$. Now we can make this
connection deeper by assigning a tangent space to each point. For this, consider
the maximal ideal $\mathfrak{M}$ defining a point, and the quotient
$\mathfrak{M}/\mathfrak{M}^2$. Suppose that $k = A/\mathfrak{M}$ is the ‘field of
values’ at the point corresponding to $\mathfrak{M}$. For elements
$m \in \mathfrak{M}$ and $a \in A$ the residue class of
$am \bmod \mathfrak{M}^2$ depends (for fixed m) only on the residue class of
$a \bmod \mathfrak{M}$, that is, on the element of k determined by a. This shows
that $\mathfrak{M}/\mathfrak{M}^2$ is a vector space over k. The dual vector
space, that is, the set of k-valued linear functions on
$\mathfrak{M}/\mathfrak{M}^2$, is the analogue of the tangent space at the point
corresponding to $\mathfrak{M}$.

This point of view is useful in the analysis of various geometric and algebraic 
situations. For example, if an irreducible algebraic curve C is given by an 
equation F(x, y) = 0, then for (a, b) € C the tangent space is given by the equation 


$$F'_x(a, b)(x - a) + F'_y(a, b)(y - b) = 0.$$


This is 1-dimensional for all points (a, b), except for points at which
$F'_x(a, b) = F'_y(a, b) = 0$. We say that a point of C is singular if both
$F'_x$ and $F'_y$ vanish there, and nonsingular otherwise. It is easy to see
that the number of singular points is finite. We see that the tangent space is
1-dimensional (that is, it has the same dimension as C) for nonsingular points,
and has bigger dimension (namely 2) for singular points. A similar situation
holds for more general algebraic varieties: the dimension of the tangent spaces
is the same at all points, except at the points of a certain proper algebraic
subvariety, at which it jumps up. This gives us, firstly,
a new characterisation of the dimension of an irreducible algebraic variety (as 
the dimension of the tangent spaces at all points except those of some proper 
subvariety); secondly, it distinguishes the singular points (the points of this proper 
subvariety); and thirdly, it gives an important invariant of a singular point (the 
jump in dimension of the tangent space). But perhaps most remarkable of all is 
that these notions are applicable to arbitrary rings, not necessarily geometric in 
origin, and allow us to use geometric intuition in their study. For example, the 
maximal ideals of the ring of integers $\mathbb{Z}$ are described by prime
numbers, and for $\mathfrak{M} = (p)$ the vector space
$\mathfrak{M}/\mathfrak{M}^2$ is 1-dimensional over $\mathbb{F}_p$, so that here
singular points do not occur.


Example 2. Consider the ring A consisting of elements $a + b\sigma$ with
$a, b \in \mathbb{Z}$, with operations defined on them as usual, together with
the condition $\sigma^2 = 1$ (this ring turns up in connection with the
arithmetical properties of representations of the group of order 2). Its maximal
ideals can be described as follows. For any prime number $p \ne 2$ we have two
maximal ideals


$$\mathfrak{M}_p = \{a + b\sigma \mid p \text{ divides } a + b\}$$
and
$$\mathfrak{M}'_p = \{a + b\sigma \mid p \text{ divides } a - b\}.$$


Obviously, $\mathfrak{M}_p = (p, 1 - \sigma)$ and
$\mathfrak{M}'_p = (p, 1 + \sigma)$. For each of these, the space
$\mathfrak{M}/\mathfrak{M}^2$ is 1-dimensional over $\mathbb{F}_p$. In addition,
there exists a further maximal ideal


$$\mathfrak{M}_2 = \{a + b\sigma \mid a \text{ and } b \text{ have the same parity}\} = (2, 1 + \sigma).$$


It is easy to see that $\mathfrak{M}_2^2 = (4, 2 + 2\sigma)$ and that
$\mathfrak{M}_2/\mathfrak{M}_2^2$ consists of 4 elements, the cosets of the
elements 0, 2, $1 + \sigma$ and $3 + \sigma$. Thus this is a 2-dimensional
vector space over $\mathbb{F}_2$. The ideal $\mathfrak{M}_2$ corresponds to the
unique singular point.
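The count of 4 cosets can be confirmed by brute force. The sketch below works in the finite quotient of A by 4A (legitimate because 4 lies in the square of the ideal), representing $a + b\sigma$ as a pair `(a, b)` modulo 4; the pair representation and all function names are assumptions made for this illustration.

```python
from itertools import product

# A/4A: pairs (a, b) standing for a + b*sigma with a, b taken modulo 4
R4 = list(product(range(4), repeat=2))


def add(x, y):
    return ((x[0] + y[0]) % 4, (x[1] + y[1]) % 4)


def mul(x, y):
    # (a + b sigma)(c + d sigma) = (ac + bd) + (ad + bc) sigma, using sigma^2 = 1
    (a, b), (c, d) = x, y
    return ((a * c + b * d) % 4, (a * d + b * c) % 4)


# Image of M_2: a and b have the same parity
M2 = [x for x in R4 if (x[0] - x[1]) % 2 == 0]

# Image of M_2^2 = (4, 2 + 2 sigma): all multiples of 2 + 2 sigma modulo 4
M2_squared = {mul(r, (2, 2)) for r in R4}

# Cosets of M_2^2 inside M_2
cosets = {frozenset(add(m, s) for s in M2_squared) for m in M2}
```

Here `len(cosets)` is 4, and the pairs `(0, 0)`, `(2, 0)`, `(1, 1)`, `(3, 1)` (that is, 0, 2, $1 + \sigma$, $3 + \sigma$) fall into four different cosets, in agreement with the text.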

All of our considerations so far have been connected with considering
quantities ‘up to infinitesimals of order 2’, which for an arbitrary ring A and
its maximal ideal $\mathfrak{M}$ reduces to considering the ring
$A/\mathfrak{M}^2$. Of course, it is also possible to consider quantities ‘up to
infinitesimals of order r’, which leads to the ring $A/\mathfrak{M}^r$. For
example, if A is the polynomial ring $\mathbb{C}[x_1, \dots, x_n]$ or the ring
of analytic functions of variables $z_1, \dots, z_n$ in a neighbourhood of the
origin, or the ring of $C^\infty$ complex-valued functions in n variables, and
$\mathfrak{M}$ is the ideal of functions which vanish at the origin
$O = (0, \dots, 0)$ then $A/\mathfrak{M}^r$ is a finite-dimensional vector space
over $\mathbb{C}$. It generalises the space $A/\mathfrak{M}^2$ we have already
considered, and is called the space of jets of order (r − 1).


Example 3. Differential Operators of Order > 1. A linear differential operator
of order $\le r$ on a differentiable manifold X can be defined formally as an
$\mathbb{R}$-linear map $\mathcal{D}\colon A \to A$ of the ring A of
differentiable functions on X to itself such that for any function $g \in A$ the
operator $\mathcal{D}_1(f) = \mathcal{D}(gf) - g\mathcal{D}(f)$ has order
$\le r - 1$. Formula (5) defining a first order operator shows that
$\mathcal{D}(gf) - g\mathcal{D}(f)$ is the operator of multiplying by a function
(namely, $\mathcal{D}(g)$); conversely, if $\mathcal{D}(gf) - g\mathcal{D}(f)$
is multiplication by a function then it is easy to check that
$\mathcal{D}(f) = \mathcal{D}'(f) + \mathcal{D}(1)f$, where $\mathcal{D}'$ is a
first order operator.

From the definition it follows by induction that if $\mathcal{D}$ is an operator
of order $\le r$ then $\mathcal{D}(\mathfrak{M}_{x_0}^{r+1}) \subset \mathfrak{M}_{x_0}$,
where $\mathfrak{M}_{x_0} \subset A$ is the maximal ideal corresponding to a
point $x_0 \in X$. In coordinates this means that $\mathcal{D}(f)(x_0)$ depends
only on the values at $x_0$ of the partial derivatives of f of order $\le r$. In
other words, we have


$$\mathcal{D}(f) = \sum_{i_1 + \cdots + i_n \le r} a_{i_1 \dots i_n}(x_1, \dots, x_n) \frac{\partial^{i_1 + \cdots + i_n} f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}, \quad\text{with } a_{i_1 \dots i_n} \in A.$$


For any point $x_0 \in X$ the map $f \mapsto \mathcal{D}(f)(x_0)$ defines a
linear function l on the space of all jets of order r:
$l \in (A/\mathfrak{M}_{x_0}^{r+1})^*$, in exactly the same way that a first
order linear differential operator defines a linear function on
$\mathfrak{M}_{x_0}/\mathfrak{M}_{x_0}^2$.

However, the most precise apparatus for studying the ring A ‘in a neighbourhood
of a maximal ideal $\mathfrak{M}$’ is obtained if we consider simultaneously all
the rings $A/\mathfrak{M}^n$ for n = 1, 2, 3, ... They can all be put together
into one ring $\hat{A}$ called the projective limit of the $A/\mathfrak{M}^n$.
For this we observe that there exists a canonical homomorphism
$\varphi_n\colon A/\mathfrak{M}^{n+1} \to A/\mathfrak{M}^n$ with kernel
$\mathfrak{M}^n/\mathfrak{M}^{n+1}$. The ring $\hat{A}$ is defined as the set of
sequences of elements $\{\alpha_n \mid \alpha_n \in A/\mathfrak{M}^n\}$ which
are compatible in the sense that $\varphi_n(\alpha_{n+1}) = \alpha_n$; the ring
operations on sequences are defined element-by-element. Each element $a \in A$
defines such a sequence, by $\alpha_n = a + \mathfrak{M}^n$, and we thus get a
homomorphism $\varphi\colon A \to \hat{A}$. The kernel of $\varphi$ is the
intersection of all the ideals $\mathfrak{M}^n$. In many interesting cases this
intersection is 0, and hence A embeds in $\hat{A}$ as a subring.

Example 4. Let A = K[x], and $\mathfrak{M} = (x)$. An element $\alpha_n$ of the
ring $K[x]/(x^n)$ is uniquely determined by a polynomial


$$f_n = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1},$$


and a sequence of elements $\{\alpha_n\}$ is compatible if the polynomial
$f_{n+1}$ representing $\alpha_{n+1}$ is obtained from $f_n$ by adding in a term
of degree n. The whole sequence thus defines an infinite (formal) power series.
In other words, the ring $\hat{A}$ is isomorphic to the ring of formal power
series K[[x]] of §3, Example 6. The inclusion $\varphi\colon K[x] \to K[[x]]$
extends to an inclusion of the fields of fractions
$\varphi\colon K(x) \to K((x))$, where K((x)) is the field of formal Laurent
power series (§ 2, Example 5). It is easy to see that this inclusion is the same
thing as sending a rational function to its Laurent series at x = 0. In
particular, if a function does not have a pole at x = 0 then it is sent to its
Taylor series. For example, if $f(x) = 1/(1 - x)$ then
$f(x) \equiv 1 + x + \cdots + x^{n-1} \bmod x^n$, or in other words, the
function $f(x) - 1 - x - \cdots - x^{n-1}$ has denominator not divisible by x,
and numerator divisible by $x^n$. This means that f(x) is sent to the series
$1 + x + x^2 + \cdots$
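The image of a rational function in $K[x]/(x^n)$ can be computed coefficient by coefficient. The following is a minimal sketch over $K = \mathbb{Q}$, assuming polynomials are given as coefficient lists `[a0, a1, ...]` and that the denominator has nonzero constant term; the function names are hypothetical.

```python
from fractions import Fraction


def inverse_mod_xn(q, n):
    """Coefficients of 1/q(x) modulo x**n, assuming q[0] != 0."""
    q = [Fraction(c) for c in q] + [Fraction(0)] * n
    inv = [Fraction(1) / q[0]]
    for k in range(1, n):
        # The coefficient of x^k in q * inv must vanish for k >= 1
        s = sum(q[j] * inv[k - j] for j in range(1, k + 1))
        inv.append(-s / q[0])
    return inv


def taylor_mod_xn(p, q, n):
    """Image of the rational function p/q in K[x]/(x**n), for q(0) != 0."""
    inv = inverse_mod_xn(q, n)
    p = list(p) + [0] * n
    return [sum(p[j] * inv[k - j] for j in range(k + 1)) for k in range(n)]
```

For instance, `taylor_mod_xn([1], [1, -1], 5)` returns five coefficients all equal to 1, the truncation of $1 + x + x^2 + \cdots$, and increasing `n` only appends further coefficients, which is the compatibility condition on the sequence $\{\alpha_n\}$.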


Example 5. Let A = K[C] be the coordinate ring of an arbitrary algebraic
variety C. If $\mathfrak{M}_c$ is the maximal ideal of A corresponding to a
nonsingular point $c \in C$, then $\hat{A}$ is isomorphic to the ring
$K[[x_1, \dots, x_n]]$ of formal power series, where n is the dimension of C (in
any of the definitions of this notion discussed above). Moreover, the inclusion


$$K[C] \to K[[x_1, \dots, x_n]]$$


extends to those functions in K(C) that are finite at c, that is, can be
represented as P/Q where $P, Q \in K[C]$ and $Q(c) \ne 0$. This gives a
representation of such functions as formal power series. If K is the complex or
real number field $\mathbb{C}$ or $\mathbb{R}$ then it can be proved that the
corresponding power series converge for sufficiently small values of
$x_1, \dots, x_n$. This is how one proves that an algebraic variety without
singular points is also a topological, differentiable and analytic manifold.


Example 6. Let A be the ring of $C^\infty$ functions in a neighbourhood of
x = 0, and $\mathfrak{M}$ the ideal of functions that vanish at x = 0. Then
$\mathfrak{M}^n$ is the ideal of functions that vanish at x = 0 together with
all of their derivatives of order < n; $A/\mathfrak{M}^n$ is the ring
$\mathbb{R}[x]/(x^n)$, and the homomorphism $A \to \mathbb{R}[x]/(x^n)$ takes a
function into its Taylor polynomial of degree < n. In this case
$\bigcap \mathfrak{M}^n \ne 0$, since there exist nonzero $C^\infty$ functions
all of whose derivatives vanish at x = 0. The homomorphism $A \to \hat{A}$ takes
each function to its formal Taylor series. Since by a theorem of E. Borel there
exist $C^\infty$ functions all of whose derivatives at x = 0 take preassigned
values, $\hat{A} \cong \mathbb{R}[[t]]$.

But the same ideas can also be applied to rings of a completely different nature. 


Example 7. Suppose that $A = \mathbb{Z}$ is the ring of integers and
$\mathfrak{M} = (p)$ for some prime number p. As $\hat{A}$ we get a ring
$\mathbb{Z}_p$ called the ring of p-adic integers. By analogy with the case of
the ring K[x] considered above, one can see that an element of $\mathbb{Z}_p$ is
given as a sequence $\{\alpha_n\}$ of integers of the form

$$\alpha_n = a_0 + a_1 p + \cdots + a_{n-1} p^{n-1},$$


where the $a_i$ belong to the fixed system $0 \le a_i < p$ of representatives of
the classes of residues mod p, and $\alpha_{n+1}$ is obtained from $\alpha_n$ by
adding on a term $a_n p^n$. This sequence can be written as a formal series


$$a_0 + a_1 p + a_2 p^2 + \cdots$$


The ring operations on these sequences are carried out in exactly the same way
as the operations on integers written in base p; that is, if operating on the
coefficient $a_i$ we get a number $c \ge p$, we must divide c by p with
remainder $c = c_0 + c_1 p$ and ‘carry $c_1$ into the next place’. The ring
$\mathbb{Z}_p$ is integral, and its field of fractions $\mathbb{Q}_p$ is the
field of p-adic numbers. The inclusion $\mathbb{Z} \hookrightarrow \mathbb{Z}_p$
extends to an inclusion $\mathbb{Q} \hookrightarrow \mathbb{Q}_p$.
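The base-p description can be made concrete for ordinary integers viewed inside the p-adic integers. The sketch below computes the first n digits of an integer and adds digit sequences with carrying; the function names are assumptions for this illustration, and only truncations modulo $p^n$ are handled.

```python
def padic_digits(a, p, n):
    """First n base-p digits a_0, ..., a_{n-1} of the integer a, viewed in Z_p."""
    a %= p ** n  # alpha_n = a mod p^n; Python's % returns a value in [0, p^n)
    digits = []
    for _ in range(n):
        a, d = divmod(a, p)
        digits.append(d)
    return digits


def add_with_carry(x, y, p):
    """Add two digit sequences of equal length, carrying into the next place."""
    out, carry = [], 0
    for dx, dy in zip(x, y):
        carry, d = divmod(dx + dy + carry, p)
        out.append(d)
    return out  # the last carry is dropped, since we work modulo p^n
```

For example, `padic_digits(-1, 5, 4)` gives `[4, 4, 4, 4]`: in the 5-adic integers, −1 is the series $4 + 4\cdot 5 + 4\cdot 5^2 + \cdots$, and adding the digits of 1 to it with carrying produces all zeros.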

To get a more rounded view of the relation between the constructions 
described above, we return to the example of the ring K[x] and the field K(x). 
For a more precise numerical characterisation of the fact that a nonzero function
$f \in K(x)$ vanishes to a given order at x = 0, we introduce the exponent
$\nu(f)$, equal to n if f has a zero of order n > 0 at x = 0, to −n if f has a
pole of order n > 0 at x = 0, and to 0 if f has neither. We fix a real number c
with 0 < c < 1 once and for all (for example, c = 1/2), and set
$\varphi(f) = c^{\nu(f)}$ for $f \ne 0$, and $\varphi(0) = 0$. Then
$\varphi(f)$ is small if f vanishes to a high order at x = 0. The expression
$\varphi(f)$ we have introduced has the formal properties of the absolute value
of a rational, real or complex number: $\varphi(f) = 0$ if and only if f = 0, and


$$\varphi(fg) = \varphi(f)\varphi(g), \qquad \varphi(f + g) \le \varphi(f) + \varphi(g). \tag{7}$$


We say that a field L having a real-valued function $\varphi$ with these three
properties is a normed field and the function $\varphi$ a valuation. The
simplest example of a normed field is the rational number field $\mathbb{Q}$
with $\varphi(x) = |x|$. The procedure of constructing the reals starting from
the rationals, by means of Cauchy sequences, can be taken over word-for-word to
any normed field. We get a new normed field $\hat{L}$, into which L embeds as a
subfield with the valuation preserved, such that the image of L is everywhere
dense; and $\hat{L}$ is complete (in the sense of its valuation), that is, it
satisfies the Cauchy convergence criterion; $\hat{L}$ is called the completion
of L with respect to the valuation $\varphi$.
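The valuation on K(x) defined above is easy to compute for explicit rational functions. The following sketch assumes a rational function is given as a pair of coefficient lists `(P, Q)` with Q nonzero, and fixes the constant c = 1/2; the representation and names are assumptions for illustration.

```python
from fractions import Fraction

C = Fraction(1, 2)  # the fixed constant c with 0 < c < 1


def order_at_zero(poly):
    """Index of the first nonzero coefficient of a nonzero polynomial."""
    for i, coeff in enumerate(poly):
        if coeff != 0:
            return i
    raise ValueError("the zero polynomial has no finite order")


def nu(P, Q):
    """Exponent of f = P/Q at x = 0: positive for a zero, negative for a pole."""
    return order_at_zero(P) - order_at_zero(Q)


def phi(P, Q):
    """Valuation phi(f) = c**nu(f) for f = P/Q, with phi(0) = 0."""
    if all(coeff == 0 for coeff in P):
        return Fraction(0)
    return C ** nu(P, Q)
```

With $f = x^2/(1 + x)$ and $g = 1/x^3$, the values `phi([0, 0, 1], [1, 1])` and `phi([1], [0, 0, 0, 1])` are 1/4 and 8, and their product equals the valuation of fg, illustrating the first relation in (7).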

It is very easy to see that the construction of the field K((x)) and of the
embedding $K(x) \to K((x))$ is an application of the general construction to the
case of the valuation $\varphi(f) = c^{\nu(f)}$ introduced above. Now we can use
the fact that the field $\widehat{K(x)} = K((x))$ has a valuation
$\hat\varphi$ extending the valuation $\varphi$ of K(x). It is easy to see what
this is: if $f \in K((x))$ and


$$f = c_n x^n + c_{n+1} x^{n+1} + \cdots \quad\text{with } c_n \ne 0$$


then $\hat\varphi(f) = c^n$, and $\hat\varphi(0) = 0$. But in a normed field the
convergence of series is meaningful, and it is easy to see that any formal
Laurent series converges in this sense; in particular, $x^n \to 0$ as
$n \to \infty$ in the sense of our theory. Taking a rational function f into its
Laurent series now turns into an equality, in the sense that in
$\widehat{K(x)} = K((x))$, f is equal to the sum of the series converging to it.

In this connection it is interesting to determine quite generally which valua- 
tions can be defined on the field K(x). We restrict ourselves to the case of the 
complex number field K = C, and strengthen the notion of valuation by adding 
a further condition to (7): 


$$\varphi(\alpha) = 1 \quad\text{if } \alpha \in \mathbb{C} \text{ and } \alpha \ne 0. \tag{8}$$


Obviously the valuation $\varphi(f) = c^{\nu(f)}$ we have constructed satisfies
this extra condition. Of course, we could vary our construction, considering any
point $x = \alpha$ in place of x = 0, that is, defining $\nu(f)$ as the order of
the zero or pole of a function f at $x = \alpha$. The valuation so obtained is
denoted by $\varphi_\alpha$. We can consider another similar valuation by
considering the order of zero or pole of a function at infinity; we denote this
valuation by $\varphi_\infty$. It can most simply be defined by
$\varphi_\infty(f) = c^{m-n}$ if $f = P/Q$ and P, Q are polynomials of degree n,
m respectively (and of course $\varphi_\infty(0) = 0$).
It is not hard to see that these valuations exhaust all the valuations of C(x). 


Theorem I. All valuations of $\mathbb{C}(x)$ (with the extra condition (8)) are
given by the valuations $\varphi_\alpha$ for $\alpha \in \mathbb{C}$, and the
valuation $\varphi_\infty$.


Thus the valuations of C(x) give us in a very natural way all the points of the 
line (including the point at infinity), or of the Riemann sphere, on which the 
rational functions are defined. 

We now ask the same question for finite extensions of C(x). These are of the 
form C(C), where C is some irreducible curve. The answer turns out to be similar, 
but rather more delicate. Every nonsingular point c of the curve C corresponds
to some valuation $\varphi_c$, characterised for example by the fact that
$\varphi_c(f) < 1$ if and only if f(c) = 0. But there are a finite number of
valuations to be added to these; firstly, the points at infinity of the curve C
(which occur if we consider a curve in the projective plane). Secondly, singular
points of C may correspond to several distinct valuations. The entire set of
valuations is in 1-to-1 correspondence with the points of a certain nonsingular
curve lying in projective space, and defining the same field $\mathbb{C}(C)$,
the so-called nonsingular projective model of C. The points of this model are
thus characterised in a very remarkable way quite intrinsically by the field
$\mathbb{C}(C)$. Another way of stating the same description is that if the curve
C is given by the equation F(x, y) = 0 then all valuations of C(C) are in 1-to-1 
correspondence with the points of the Riemann surface of the function y as an 
analytic function of x. This can be considered as a purely algebraic description 
of the Riemann surface of an algebraic function. 

Let $\xi = (a, b)$ be some point of an algebraic curve C with equation
F(x, y) = 0, and $\varphi$ one of the valuations corresponding to $\xi$. Then
the completion of $\mathbb{C}(C)$ with respect to the valuation $\varphi$ is
again isomorphic to the field $\mathbb{C}((t))$ of formal Laurent series.
Suppose that under the inclusion $\mathbb{C}(C) \subset \mathbb{C}((t))$,


$$x - a = c_k t^k + c_{k+1} t^{k+1} + \cdots, \quad\text{with } c_k \ne 0.$$


Then $x - a = t^k f(t)$ with $f(0) \ne 0$. Thus $x - a = \tau^k$, where
$\tau = t f(t)^{1/k}$, and $f(t)^{1/k}$ must be understood as a formal power
series, which is meaningful in view of the condition $f(0) \ne 0$. It is easy to
show that $\tau$ is a ‘parameter’ of the field $\mathbb{C}((t))$ as well as t;
that is, all elements of $\mathbb{C}((t))$ can be represented as Laurent series
in $\tau$ also, so that $\mathbb{C}((t)) = \mathbb{C}((\tau))$. In particular,


$$y = \sum d_i \tau^i = \sum d_i (x - a)^{i/k}.$$


This type of expansion of an algebraic function y as a fractional power series in
$x - a$ is called a Puiseux expansion.

We now proceed to the rational number field $\mathbb{Q}$. Let p be a prime, and
c a real number with 0 < c < 1. We write $\nu(n)$ for the exponent of the
highest power of p which divides n, and for a rational number $a = n/m$ with
$n, m \in \mathbb{Z}$, we set


$$\varphi_p(a) = c^{\nu(n) - \nu(m)}.$$


It is easy to check that $\varphi_p$ is a valuation on the rational number field
$\mathbb{Q}$. Considering the completion of $\mathbb{Q}$ in this valuation, we
arrive at the p-adic number field $\mathbb{Q}_p$ which was introduced earlier.
In it, the notion of convergence of series makes sense, and the formal power
series which we used to specify p-adic numbers are convergent. For example, the
equality


$$\frac{1}{1 - p} = 1 + p + p^2 + \cdots$$


has the meaning that the number on the left-hand side is the sum of the
convergent series on the right.
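The convergence claimed here can be observed numerically. The following sketch computes the p-adic valuation of a rational number with exact arithmetic, taking c = 1/2 as an arbitrary choice of the constant; the function names are assumptions for this illustration.

```python
from fractions import Fraction


def nu_p(n, p):
    """Exponent of the highest power of p dividing the nonzero integer n."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def phi_p(a, p, c=Fraction(1, 2)):
    """p-adic valuation phi_p(a) = c**(nu(n) - nu(m)) for a = n/m; phi_p(0) = 0."""
    a = Fraction(a)
    if a == 0:
        return Fraction(0)  # handled separately: nu_p is only defined for n != 0
    return c ** (nu_p(a.numerator, p) - nu_p(a.denominator, p))


def partial_sum(p, n):
    """1 + p + ... + p**(n - 1)."""
    return sum(p ** i for i in range(n))
```

For p = 5, the difference between `partial_sum(5, n)` and `Fraction(1, 1 - 5)` equals $-5^n/(1 - 5)$, whose valuation is $c^n$; this tends to 0 as n grows, which is the sense in which the series converges to $1/(1 - p)$.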


By analogy with the field C(x) it is natural to ask: what are all the valuations 
of Q? 


II. Ostrowski’s Theorem. Every valuation of $\mathbb{Q}$ is either a p-adic
valuation $\varphi_p$ or a valuation of the form $\varphi(a) = |a|^c$, where c
is a real number with $0 < c \le 1$.


The number c here is an inessential parameter, exactly as that occurring in the
definition of a p-adic valuation or of the valuation $\varphi_\alpha$ of
$\mathbb{C}(x)$: valuations obtained for different choices of c define the same
notion of convergence and isomorphic completions. The completion with respect to
the valuation $|\ |^c$ gives of course the real number field. Thus all the
p-adic number fields $\mathbb{Q}_p$ and the real number field $\mathbb{R}$ play
entirely similar roles. The comparison with the field $\mathbb{C}(x)$ shows that
primes p (defining the fields $\mathbb{Q}_p$) are analogous to finite points
$x = \alpha$, and the inclusions $\mathbb{Q} \to \mathbb{Q}_p$ are analogous to
expansions in Laurent series at finite points; then the inclusion
$\mathbb{Q} \to \mathbb{R}$ is an analogue of the Laurent expansion at infinity.
This gives a unified point of view on two types of properties of integers (or
rational numbers): divisibility, and size. For example, for
$f \in \mathbb{Z}[x]$, the fact that the equation f(x) = 0 has a real root means
that there exist rational numbers $a_n$ for which $|f(a_n)|$ is arbitrarily
small. In the same way, the fact that f(x) = 0 is solvable in the p-adic field
means that there exist rational numbers $a_n$ for which $\varphi_p(f(a_n))$ is
arbitrarily small, that is, that $f(a_n)$ is divisible by larger and larger
powers of p. It can be shown that for a polynomial $f(x_1, \dots, x_n)$ the
solvability of the equation


$$f(x_1, \dots, x_n) = 0$$

in $\mathbb{Q}_p$ is equivalent to the solvability of the congruence

$$f(x_1, \dots, x_n) \equiv 0 \bmod p^k$$


for any k. Since a congruence to any modulus reduces to congruences mod $p^k$,
the solvability of the equation f = 0 in all the fields $\mathbb{Q}_p$ is
equivalent to the solvability of the congruence


$$f \equiv 0 \bmod N$$


for any modulus N. For example, the following assertion is a classical result of 
number theory. 


III. Legendre’s Theorem. The equation

$$ax^2 + by^2 = c \quad (\text{for } a, b, c \in \mathbb{Z} \text{ and } c > 0)$$


is solvable in rational numbers if and only if the following conditions hold:

(1) either a > 0 or b > 0;

(2) the congruence $ax^2 + by^2 \equiv c \bmod N$ is solvable for all N.

By what we have said above, this means that the equation $ax^2 + by^2 = c$ is
solvable in rationals if and only if it is solvable in each of the fields
$\mathbb{Q}_p$ and $\mathbb{R}$.
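For a single modulus N, condition (2) can be tested by exhaustive search. The sketch below is only an illustration of that finite check (it does not decide solvability for all N at once); the function name is hypothetical.

```python
def congruence_solutions(a, b, c, N):
    """All pairs (x, y) modulo N with a*x^2 + b*y^2 congruent to c modulo N."""
    return [(x, y) for x in range(N) for y in range(N)
            if (a * x * x + b * y * y - c) % N == 0]
```

For example, `congruence_solutions(1, 1, 3, 4)` is empty, since squares are 0 or 1 modulo 4; this is consistent with the fact that 3 is not a sum of two rational squares. By contrast, $x^2 + y^2 = 5$ has the solution (1, 2), and the corresponding congruence has solutions for every modulus tried.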


This result can be generalised. 

IV. Minkowski-Hasse Theorem. The equation

$$f(x_1, \dots, x_n) = c,$$


where f is a quadratic form with rational coefficients, and $c \in \mathbb{Q}$,
is solvable in $\mathbb{Q}$ if and only if it is solvable in all the fields
$\mathbb{Q}_p$ and $\mathbb{R}$.


The p-adic number field reflects arithmetic properties of the rational numbers
(divisibility by powers of p), but on the other hand, it has a number of properties
in common with the field $\mathbb{R}$; in $\mathbb{Q}_p$ we can consider
measures, integrals, analytic functions, interpolation and so on. This gives a
powerful number-theoretic method (especially if all the fields $\mathbb{Q}_p$
and $\mathbb{R}$ are considered together), the use of which has led to a large
number of deep arithmetic results.

In conclusion, we consider a finite extension field K of $\mathbb{Q}$; a field
of this type is called an algebraic number field. What valuations are there on
K? Every valuation induces a certain valuation of $\mathbb{Q}$, and one can
prove that any valuation of $\mathbb{Q}$ is induced by a finite number of
valuations of K. Those which induce the usual absolute value |a| on $\mathbb{Q}$
are related to embeddings of K into the real field $\mathbb{R}$ or the complex
field $\mathbb{C}$ and the function |x| on these fields. We consider other
valuations. In $\mathbb{Q}$ the subring $\mathbb{Z}$ is distinguished by the
conditions $\varphi_p(a) \le 1$ for all p. By analogy, consider the elements of
K satisfying $\varphi(\alpha) \le 1$ for all the valuations of K inducing the
valuations $\varphi_p$ of $\mathbb{Q}$ for some prime p. One sees easily that
these elements form a ring A, which plays the role of the ring of integers of
K; the elements of A are called algebraic integers. (It can be proved that
$\alpha \in K$ is an algebraic integer if and only if it satisfies an equation


$$\alpha^n + a_1\alpha^{n-1} + \cdots + a_n = 0 \quad\text{with } a_1, \dots, a_n \in \mathbb{Z};$$


this is often taken as the definition of an algebraic integer.) The field of
fractions of A equals K. Obviously, $A \supset \mathbb{Z}$. It can be proved
that A is a free module over $\mathbb{Z}$, of rank equal to the degree
$[K : \mathbb{Q}]$ of the extension $K/\mathbb{Q}$. The ring A is in general not
a unique factorisation domain, but the theorem on unique factorisation of ideals
into a product of prime ideals holds in it. In particular, for any prime ideal
$\mathfrak{p}$ and nonzero element $\alpha \in A$ there is a well-defined
exponent $\nu(\alpha)$ which tells us what power of $\mathfrak{p}$ divides the
principal ideal $(\alpha)$. We choose a real number c with 0 < c < 1, and for
any element $\xi \in K$, $\xi \ne 0$, write $\xi = \alpha/\beta$ with
$\alpha, \beta \in A$ and set


$$\varphi_{\mathfrak{p}}(\xi) = c^{\nu(\alpha) - \nu(\beta)}.$$


Thus to each prime ideal $\mathfrak{p}$ of A we assign a valuation
$\varphi_{\mathfrak{p}}$. It turns out that these exhaust all the valuations of
K that induce one of the valuations $\varphi_p$ on $\mathbb{Q}$. These facts
make up the first steps in the arithmetic of algebraic number fields. Comparing
them with the analogous facts which we have discussed above in connection with
the fields $\mathbb{C}(C)$ for an algebraic curve C, we can observe a
far-reaching parallelism between the arithmetic of algebraic number fields and
the geometry of algebraic curves (or properties of the corresponding Riemann
surfaces). This is a further realisation of the ‘functional’ point of view of numbers
which we discussed in § 4 (see the remark after Example 3).


§8. Noncommutative Rings 


The set of linear transformations of a finite-dimensional vector space has two 
operations defined on it, addition and multiplication; writing out linear
transformations in terms of matrixes, these operations can be transferred to matrixes
as well. The existence of both these operations is extremely important and is 
constantly used. It is, for example, only because of this that we can define 
polynomials in a linear operator; and, among other uses, they are used in the 
study of the structure of a linear transformation, which depends in an essential 
way on the multiplicity of roots of its minimal polynomial. The same two 
operations, together with a passage to limits, make it possible to define analytic 
functions of a (real or complex) matrix. For example,


$$e^A = \sum_{n=0}^{\infty} \frac{A^n}{n!};$$


by writing out a system of n first order linear ordinary differential equations
with constant coefficients in n unknowns in the form $\frac{dx}{dt} = Ax$, where
x is the vector of unknown functions and A the matrix of coefficients, this
allows us to write the solution in the form $x(t) = e^{At}x_0$, where $x_0$ is
the vector of initial data.
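The role of the two ring operations plus a limit can be seen in a direct computation. The following is a rough sketch that approximates the exponential series by a partial sum (30 terms is an arbitrary choice, adequate only for small matrices with modest entries) and uses it to solve a sample system; the names `mat_exp` and `solve` are assumptions for this illustration.

```python
import math


def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_exp(A, terms=30):
    """Partial sum of the series sum_k A^k / k! (a truncation, not exact)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term A^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


def solve(A, x0, t):
    """Approximate x(t) = exp(A t) x0 for the system dx/dt = A x."""
    E = mat_exp([[a * t for a in row] for row in A])
    return [sum(E[i][j] * x0[j] for j in range(len(x0))) for i in range(len(x0))]
```

For the sample matrix `A = [[0, 1], [-1, 0]]` and initial data `[1, 0]`, the exact solution is $(\cos t, -\sin t)$, and `solve(A, [1, 0], 1.0)` agrees with `math.cos(1.0)` and `-math.sin(1.0)` to within floating-point error.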

The operations of addition and multiplication of linear transformations are 
subject to all the axioms of a commutative ring, except commutativity of multi- 
plication. Omitting this requirement from the definition of a commutative ring, 
we also omit the adjective ‘commutative’ in the name of the new notion. 

Thus, a ring is a set with operations of addition and multiplication, satisfying 
the conditions: 

$$a + b = b + a,$$
$$a + (b + c) = (a + b) + c,$$
$$(ab)c = a(bc),$$
$$a(b + c) = ab + ac,$$
$$(b + c)a = ba + ca.$$


There exists an element 0 such that a + 0 = 0 + a = a for all a; for any a there
exists an element −a with the property a + (−a) = 0. There exists an element 1
such that $1 \cdot a = a \cdot 1 = a$ for all a.

We now give some examples of rings (noncommutative ones; we have already 
seen any number of commutative ones). 


Example 1. The ring of linear transformations of a vector space L, and its
natural generalisation, the ring of all homomorphisms of a module M to itself
over a commutative ring A. Homomorphisms of a module to itself are called
endomorphisms, and the ring defined above is denoted by $\operatorname{End}_A M$.
If A = K is a field, we get the ring of linear transformations of a vector space
L, which we will also denote by $\operatorname{End}_K L$.


Example 2. The simplest infinite-dimensional analogue of the ring of linear
transformations is the ring of bounded linear operators in a Banach space.


Example 3. The ring of linear differential operators in 1 or n variables, whose
coefficients are polynomials, or analytic functions, or $C^\infty$ functions, or
formal power series (in the same number of variables, of course).


Before proceeding to consider further examples, we note those notions which 
we introduced for commutative rings, but which did not in fact use commuta- 
tivity. These are: isomorphism, homomorphism, kernel and image of a homomor- 
phism, subring, graded ring. 

For example, choosing a basis in an n-dimensional vector space L over a field
K determines an isomorphism of the ring $\operatorname{End}_K L$ with the ring of
$n \times n$ matrixes, which we denote $M_n(K)$.

In a ring R the set of elements a commuting with all elements of R (that is,
ax = xa for all $x \in R$) forms a subring, called the centre of R, and denoted
by Z(R).

If the centre of a ring R contains a subring A then we say that R is an A-algebra 
or an algebra over A. Forgetting about multiplication in R and considering only 
multiplication of elements of R by elements of A turns R into an A-module. The 
notion of homomorphism of two algebras over a commutative ring A differs from 
an ordinary ring homomorphism in that we insist that each element of A is taken 
into itself, that is, that the homomorphism defines a homomorphism of the 
corresponding A-modules. The notion of subalgebra of an algebra R over A is
defined in the same way: it should be a subring containing A.

If A = K is a field and R is an algebra over K then the dimension of R as a
vector space over K is the rank of the algebra R. We have already met this notion:
a finite extension L/K is an extension which is an algebra of finite rank. An
algebra of finite rank n over a field K has by definition a basis $e_1, \dots, e_n$,
and multiplication in the algebra is determined by the multiplication of elements
of this basis. Since $e_i e_j$ is again an element of the algebra, it can be written
in the form

$$e_i e_j = \sum_k c_{ijk} e_k \quad \text{with } c_{ijk} \in K. \tag{1}$$

The elements $c_{ijk}$ are called the structure constants of the algebra. They
determine multiplication in the algebra:

$$\Bigl(\sum_i a_i e_i\Bigr)\Bigl(\sum_j b_j e_j\Bigr) = \sum_{i,j,k} a_i b_j c_{ijk} e_k.$$


The relations (1) are referred to as the multiplication table of the algebra. Of 
course, the structure constants cannot be given in an arbitrary way: they have to
satisfy the conditions that reflect the requirement that multiplication is associa- 
tive and there exists a unit element. 

For example, the matrix ring $M_n(K)$ is an algebra of rank $n^2$ over K. As a
basis we can take the $n^2$ elements $E_{ij}$, where $E_{ij}$ is the matrix with all
entries equal to 0 except for the entry in the ith row and jth column, which is 1.
Its structure constants are determined by

$$E_{ij} E_{kl} = 0 \ \text{ if } j \neq k, \qquad E_{ij} E_{jl} = E_{il}. \tag{2}$$
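The requirement that structure constants encode an associative multiplication with a unit can be tested mechanically. The following is a minimal sketch, not part of the original text: it builds the structure constants of $M_2(K)$ from rule (2) and checks associativity and the unit on basis elements. The function names and the dictionary representation of elements are assumptions chosen for illustration.

```python
from itertools import product

def matrix_unit_constants(n):
    """Structure constants of M_n(K) in the basis E_ij, read off from rule (2).

    Basis elements are index pairs (i, j); the key (a, b, e) of the returned
    dict stores the coefficient of the basis element e in the product a * b.
    """
    basis = [(i, j) for i in range(n) for j in range(n)]
    c = {}
    for (i, j), (k, l) in product(basis, repeat=2):
        if j == k:                       # E_ij E_jl = E_il, all other products are 0
            c[((i, j), (k, l), (i, l))] = 1
    return basis, c

def multiply(x, y, c):
    """Product of two elements given as {basis element: coefficient} dicts."""
    out = {}
    for (a, b, e), coeff in c.items():
        term = x.get(a, 0) * y.get(b, 0) * coeff
        if term:
            out[e] = out.get(e, 0) + term
    return {e: v for e, v in out.items() if v}

def is_associative(basis, c):
    """Check (e_a e_b) e_d == e_a (e_b e_d) on all triples of basis elements."""
    for a, b, d in product(basis, repeat=3):
        left = multiply(multiply({a: 1}, {b: 1}, c), {d: 1}, c)
        right = multiply({a: 1}, multiply({b: 1}, {d: 1}, c), c)
        if left != right:
            return False
    return True

basis, c = matrix_unit_constants(2)
unit = {(i, i): 1 for i in range(2)}     # the identity matrix E_11 + E_22
```

Checking associativity on basis triples is enough, because the multiplication is extended bilinearly from the basis.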




Now we can introduce some more examples of rings, given most simply as 
algebras over a field. 


Example 4. Let G be a finite group (we assume that the reader is familiar with
this notion, although in any case it is recalled in § 12). We construct an algebra
over a field K whose basis elements $e_g$ for $g \in G$ are indexed by elements of
the group, and which multiply together as elements of G:

$$e_{g_1} e_{g_2} = e_{g_1 g_2}.$$


The algebra so obtained is called the group algebra of G, and is denoted by K[G].
In the same way, we can define the group algebra A[G] of a finite group G over
a commutative ring A. Identifying elements $g \in G$ with the corresponding basis
elements $e_g$, we can view the elements of K[G] as sums $\sum_{g \in G} \alpha_g g$.
The product $\bigl(\sum_{g \in G} \alpha_g g\bigr)\bigl(\sum_{h \in G} \beta_h h\bigr)$
can of course also be written in the same form $\sum_{g \in G} \gamma_g g$, where,
as is easy to check,

$$\gamma_g = \sum_{u \in G} \alpha_u \beta_{u^{-1} g}. \tag{3}$$

An element $\sum \alpha_g g$ is determined by its coefficients, which we can view as
functions on G, and write accordingly $\alpha(g)$. We then get an interpretation of
K[G] as the algebra of functions on G, with multiplication taking functions
$\alpha(g)$, $\beta(g)$ into the function $\gamma(g)$ given as in (3) by

$$\gamma(g) = \sum_{u \in G} \alpha(u) \beta(u^{-1} g). \tag{4}$$

This notation is the starting point for generalisations to infinite groups. For
example, if G is the unit circle $|z| = 1$, writing elements of G in terms of their
argument $\varphi$, we see that a function on G is just a periodic function of
$\varphi$ with period $2\pi$. By analogy with formula (4), the group algebra of our
group is defined as the algebra of periodic functions $\alpha(\varphi)$ (for example
continuous and absolutely integrable) with the multiplication law which takes
$\alpha(\varphi)$, $\beta(\varphi)$ into the function

$$\gamma(\varphi) = \frac{1}{2\pi} \int_0^{2\pi} \alpha(t) \beta(\varphi - t)\, dt.$$


In analysis, this operation is called the convolution of two functions. 
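For a finite group the convolution formula can be checked directly against multiplication of formal sums. The sketch below is an illustration only: it takes the cyclic group Z/4 (a discrete stand-in for the circle, chosen here as an assumption) and compares the product computed from the rule $e_{g_1} e_{g_2} = e_{g_1 g_2}$ with the sum over u of $\alpha(u)\beta(u^{-1}g)$. The helper names are hypothetical.

```python
def group_algebra_product(alpha, beta, mul):
    """Multiply two elements of K[G] given as {group element: coefficient}."""
    gamma = {}
    for u, a in alpha.items():
        for h, b in beta.items():
            g = mul(u, h)                      # basis elements multiply as in G
            gamma[g] = gamma.get(g, 0) + a * b
    return gamma

def convolution(alpha, beta, elements, mul, inv):
    """gamma(g) = sum over u in G of alpha(u) * beta(u^{-1} g)."""
    return {
        g: sum(alpha.get(u, 0) * beta.get(mul(inv(u), g), 0) for u in elements)
        for g in elements
    }

n = 4
elements = list(range(n))                      # Z/4 written additively

def mul(a, b):
    return (a + b) % n

def inv(a):
    return (-a) % n

alpha = {0: 1, 1: 2}                           # 1*e_0 + 2*e_1
beta = {1: 3, 3: 1}                            # 3*e_1 + 1*e_3
direct = group_algebra_product(alpha, beta, mul)
via_formula = convolution(alpha, beta, elements, mul, inv)
```

Both computations give the same coefficients; replacing `mul` and `inv` by the operations of a nonabelian group would exercise the same code on a noncommutative group algebra.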

This definition fails in one formal respect: the group algebra does not contain 
the identity element, which is the delta-function of the unit element. We can easily 
overcome this failure by adjoining a unit to R, that is, considering
$\mathbb{C} \oplus R$ with multiplication
$(\alpha + x)(\beta + y) = \alpha\beta + (\alpha y + \beta x + xy)$.

Another way of generalising the notion of group algebra to infinite groups is 
applicable to countable groups, and is related to considering series instead of 
functions: we consider infinite series (for example, absolutely convergent) of the 
form $\sum \alpha_g g$ with $\alpha_g \in \mathbb{C}$, and the multiplication law
given by (3).


Example 5. The most famous example of a noncommutative ring is the quaternion
algebra $\mathbb{H}$. This is an algebra of rank 4 over the field of real numbers
$\mathbb{R}$ with basis 1, i, j, k, having the multiplication law

$$i^2 = j^2 = k^2 = -1, \quad ij = k, \ ji = -k, \quad jk = i, \ kj = -i, \quad ki = j, \ ik = -j,$$


that is, if we write i, j, k around a circle (Fig. 10), then the product of two
adjacent elements taken in clockwise order is equal to the third, and taken
anticlockwise is equal to minus the third.

[Fig. 10: the elements i, j, k arranged clockwise around a circle]

The modulus (or absolute value) of a quaternion $q = a + bi + cj + dk$ is
the number $|q| = \sqrt{a^2 + b^2 + c^2 + d^2}$; the conjugate of q is the quaternion
$\bar q = a - bi - cj - dk$. The relations

$$q \bar q = \bar q q = |q|^2 \quad \text{and} \quad \overline{q_1 q_2} = \bar q_2 \bar q_1 \tag{5}$$

are easy to check. It follows from these that if $q \neq 0$ then the quaternion
$q^{-1} = \frac{1}{|q|^2} \bar q$ is an inverse of q, that is, $q q^{-1} = q^{-1} q = 1$.
If $q = a + bi + cj + dk$ then a is called the real part of q and $bi + cj + dk$ the
imaginary part; they are denoted by Re q and Im q. If a = 0 then q is purely
imaginary. In this case it corresponds to a 3-dimensional vector $x = (b, c, d)$.
The product of two purely imaginary quaternions can be expressed in terms of the
two basic algebraic operations on 3-dimensional vectors, the scalar product
$(x, y)$ and the vector product $[x, y]$; in fact if purely imaginary quaternions
p and q correspond to vectors x and y then $\operatorname{Re}(pq) = -(x, y)$ and
$\operatorname{Im}(pq)$ corresponds to the vector $[x, y]$.
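The relations for conjugates and for products of purely imaginary quaternions are easy to confirm numerically. The following sketch is an illustration under the assumption that a quaternion $a + bi + cj + dk$ is stored as a 4-tuple (a, b, c, d) of integers; the function names are not from the text.

```python
def qmul(p, q):
    """Quaternion product, expanded from the multiplication table of i, j, k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def norm2(q):
    """Square of the modulus |q|."""
    return sum(v * v for v in q)

def dot(x, y):
    return sum(s * t for s, t in zip(x, y))

def cross(x, y):
    return (x[1]*y[2] - x[2]*y[1],
            x[2]*y[0] - x[0]*y[2],
            x[0]*y[1] - x[1]*y[0])

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
q1, q2 = (1, 2, 3, 4), (5, -6, 7, 8)
x, y = (1, 2, 3), (4, 5, 6)
p_imag = (0,) + x                 # purely imaginary quaternions for x and y
q_imag = (0,) + y
prod = qmul(p_imag, q_imag)
```

Here `prod[0]` is the real part of the product and `prod[1:]` its imaginary part, to be compared with `-dot(x, y)` and `cross(x, y)`.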

From the equalities (5) it follows easily that $|q_1 q_2| = |q_1| \cdot |q_2|$ for
two quaternions $q_1$ and $q_2$. This means that if a, b, c, d and
$a_1, b_1, c_1, d_1$ are arbitrary numbers, then the product

$$(a^2 + b^2 + c^2 + d^2)(a_1^2 + b_1^2 + c_1^2 + d_1^2)$$

can be written in the form $a_2^2 + b_2^2 + c_2^2 + d_2^2$, where $a_2, b_2, c_2, d_2$
(which are the coefficients of 1, i, j, k in the quaternion $q_1 q_2$) can be
expressed very simply in terms of a, b, c, d and $a_1, b_1, c_1, d_1$ (the reader
can easily write out these expressions). The resulting identity was discovered by
Euler long before Hamilton's introduction of quaternions; it is useful, for
example, in the proof of Lagrange's famous theorem that any natural number n is
equal to a sum of squares of four integers: using this identity, the problem
reduces at once to the case of a prime number n.
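The expressions left to the reader can be written out and tested. The sketch below is one way to do so: `four_square_product` (a name introduced here for illustration) returns the coefficients of 1, i, j, k in the product of the quaternions with coefficients (a, b, c, d) and (a1, b1, c1, d1), and the identity is then checked on integers.

```python
def four_square_product(x, y):
    """Coefficients (a2, b2, c2, d2) of the quaternion product of x and y."""
    a, b, c, d = x
    a1, b1, c1, d1 = y
    return (a*a1 - b*b1 - c*c1 - d*d1,
            a*b1 + b*a1 + c*d1 - d*c1,
            a*c1 - b*d1 + c*a1 + d*b1,
            a*d1 + b*c1 - c*b1 + d*a1)

def sum_of_squares(t):
    return sum(v * v for v in t)

# 3 = 1 + 1 + 1 + 0 and 7 = 4 + 1 + 1 + 1, so 21 = 3 * 7 is a sum of four squares.
three = (1, 1, 1, 0)
seven = (2, 1, 1, 1)
twenty_one = four_square_product(three, seven)
```

This is how the reduction to primes works in practice: representations of two factors as sums of four squares combine into a representation of their product.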


Example 6. The quaternions contain the field $\mathbb{C}$ of complex numbers, as
elements of the form $a + bi$. Any quaternion can be uniquely written in the form
$z_1 + z_2 j$ with $z_1, z_2 \in \mathbb{C}$. This expression

$$\mathbb{H} = \mathbb{C} \oplus \mathbb{C} j \tag{6}$$

gives a convenient way of representing quaternions. When handling quaternions
written in this form, we need only remember that $z \in \mathbb{C}$ and j do not
commute. However, it is easy to check that their commutation is subject to the
simple rule

$$jz = \bar z j. \tag{7}$$


The representation (6) has one important geometric application. Suppose
we consider pairs $(q_1, q_2) \neq (0, 0)$ with $q_1, q_2 \in \mathbb{H}$ and identify
pairs which are proportional 'on the left': $(q_1, q_2) \sim (q q_1, q q_2)$ for
$q \neq 0$. We obtain the quaternionic projective line $P^1(\mathbb{H})$. Just like
the real and complex projective plane, it contains a finite part, the pairs
$(q_1, q_2)$ with $q_2 \neq 0$, which we can identify with $\mathbb{H}$ (by taking
$q_2 = 1$), and $P^1(\mathbb{H})$ is obtained from $\mathbb{H}$ by adding the point at
infinity $(q_1, 0)$. This shows that, as a manifold, $P^1(\mathbb{H})$ is
diffeomorphic to the 4-dimensional sphere $S^4$. Representing $\mathbb{H}$ in the
form (6) and setting $q_1 = z_1 + z_2 j$, $q_2 = z_3 + z_4 j$, we replace the pair
$(q_1, q_2)$ by the 4-tuple $(z_1, z_2, z_3, z_4)$ in which not all $z_i$ are zero.
These 4-tuples, considered up to nonzero complex multiples, form the
3-dimensional complex projective space $P^3(\mathbb{C})$. Both $P^1(\mathbb{H})$ and
$P^3(\mathbb{C})$ are obtained from the same set of pairs $(q_1, q_2)$, but by means
of different identification processes, differing by the choice of proportionality
factors: $q \in \mathbb{H}$ in the first case, and $q \in \mathbb{C}$ in the second.
Since pairs identified in the second case are obviously also identified in the
first, we get a map

$$P^3(\mathbb{C}) \to S^4.$$

This is the twistor space over the sphere $S^4$, which is very important in
geometry; its fibres form a certain 4-dimensional family of lines of
$P^3(\mathbb{C})$. It allows us to reduce many differential-geometric questions
concerning the sphere $S^4$ to questions of complex analytic geometry of
$P^3(\mathbb{C})$.

Other applications of quaternions, to the study of the groups of orthogonal 
transformations of 3- and 4-dimensional space, will appear in § 15. 

A ring in which any nonzero element a has an inverse $a^{-1}$ (that is, an element
such that $a a^{-1} = a^{-1} a = 1$) is a division algebra or skew field. In fact it
is enough to assume only the existence of a left inverse $a^{-1}$, such that
$a^{-1} a = 1$ (or only a right inverse). If $a'$ is a left inverse of a and $a''$ a
left inverse of $a'$ then, by associativity, $a'' a' a$ is equal to both a (since
$a'' a' = 1$) and $a''$ (since $a' a = 1$). Hence $a = a''$, and this gives
$a a' = 1$, so that $a'$ is also a right inverse. A field is a commutative division
algebra, and the quaternions are the first example we have met of a
noncommutative division algebra. It is easy to check that there is only one
inverse in a division algebra for a given element. In a division algebra any
equation $ax = b$ with $a \neq 0$ can be solved: $x = a^{-1} b$; similarly, for
$ya = b$ with $a \neq 0$, $y = b a^{-1}$.

The standard notions of linear algebra over a field K carry over word-for-word
to the case of vector spaces over an arbitrary division algebra D. We observe
only the single distinction, which is significant, although formal: if a linear
transformation $\varphi$ of an n-dimensional vector space over a division algebra
is given in a basis $e_1, \dots, e_n$ by a matrix $(a_{ij})$ and $\psi$ by a matrix
$(b_{ij})$ then, as one can easily check, the transformation $\varphi\psi$ is given
by the matrix $(c_{ij})$, where

$$c_{ij} = \sum_k b_{kj} a_{ik}. \tag{8}$$

In other words, in the usual formula for multiplying matrixes, we must
interchange the order of the terms. (This can already be observed in the example
of 1-dimensional vector spaces!)
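The 1-dimensional case can be made concrete with quaternions. In the sketch below, which rests on assumptions made only for illustration, the quaternions are viewed as a 1-dimensional vector space over themselves with scalars acting on the left; a linear transformation is then right multiplication by a fixed quaternion, which serves as its 1 x 1 matrix. Quaternions are stored as 4-tuples and all names are hypothetical.

```python
def qmul(p, q):
    """Quaternion product on 4-tuples (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def right_mult(a):
    """The left-linear map x -> x*a; its 1 x 1 matrix in the basis {1} is (a)."""
    return lambda x: qmul(x, a)

one = (1, 0, 0, 0)
a = (0, 1, 0, 0)                  # matrix entry of phi (the quaternion i)
b = (0, 0, 1, 0)                  # matrix entry of psi (the quaternion j)
phi, psi = right_mult(a), right_mult(b)

def composite(x):
    """The transformation phi psi: first psi, then phi."""
    return phi(psi(x))

matrix_of_composite = composite(one)   # image of the basis vector 1
```

The composite sends x to (x b) a = x (b a), so its matrix entry is b a and not a b; for a = i and b = j these differ by a sign.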

In this connection we introduce the following definition. 

Rings R and R' are said to be opposite or skew-isomorphic if there exists a
1-to-1 correspondence $a \leftrightarrow a'$ between $a \in R$ and $a' \in R'$ with
the properties that

$$a_1 \leftrightarrow a_1' \text{ and } a_2 \leftrightarrow a_2' \implies a_1 + a_2 \leftrightarrow a_1' + a_2' \text{ and } a_1 a_2 \leftrightarrow a_2' a_1'.$$

A correspondence $a \leftrightarrow a'$ which establishes a skew-isomorphism of R
with itself is called an involution of R. Examples are the correspondences
$a \leftrightarrow a^*$ (where $a^*$ is the transpose matrix) in the matrix ring
$M_n(A)$ over a commutative ring A,
$\sum \alpha_g g \leftrightarrow \sum \alpha_g g^{-1}$ in the group ring A[G], and
$q \leftrightarrow \bar q$ in the quaternion algebra $\mathbb{H}$.

For each ring R there exists an opposite ring R’ skew-isomorphic to R. To get 
this, we simply take the set of elements of R with the same addition and define 
the product of two elements a and b to be ba instead of ab. 

Now we can describe the result expressed in (8) above as: 


Theorem I. The ring of linear transformations of an n-dimensional vector space
over a division algebra D is isomorphic to the matrix ring $M_n(D')$ over the
opposite division algebra $D'$.


With the exception of this alteration, the well-known results of linear algebra 
are preserved for vector spaces over division algebras. Going further, we can also 
define the projective space $P^n(D)$ over D, and this will again have most of the
properties we are familiar with. 


Example 7. We consider the space $T^r(L)$ of contravariant tensors of degree r
over an n-dimensional vector space L over a field K (see § 5 for the definition of
the module $T^r(M)$). The tensor product operation defines the product of tensors
$\varphi \in T^r(L)$ and $\psi \in T^s(L)$ as a tensor
$\varphi \otimes \psi \in T^{r+s}(L)$. To construct a ring by means of this
operation, consider the direct sum $\bigoplus T^r(L)$ of all the spaces $T^r(L)$,
consisting of sequences $(\varphi_0, \varphi_1, \dots)$ with only a finite number
of nonzero terms, and $\varphi_r \in T^r(L)$. We define the sum of sequences
component-by-component, and the product of $(\varphi_0, \varphi_1, \dots)$ and
$(\psi_0, \psi_1, \dots)$ as $(\xi_0, \xi_1, \dots)$, where

$$\xi_p = \sum_{0 \le r \le p} \varphi_r \psi_{p-r}.$$

It follows from properties of multiplication of tensors that we get a ring in
this way. It contains subspaces $T^r(L)$ for r = 0, 1, ..., and each element can be
represented as a finite sum $\varphi_0 + \varphi_1 + \dots + \varphi_k$ with
$\varphi_r \in T^r(L)$. The elements $\varphi_0 \in T^0(L) = K$ are identified with
elements of K, so that the ring constructed is a K-algebra. It is called the
tensor algebra of the vector space L, and denoted by T(L). The decomposition of
T(L) as the sum of the $T^r(L)$ makes T(L) into a graded algebra.

Let us choose a basis $\xi_1, \xi_2, \dots, \xi_n$ of $T^1(L) = L$. The well-known
properties of tensor multiplication show that the products
$\xi_{i_1} \cdot \ldots \cdot \xi_{i_m}$, where $(i_1, \dots, i_m)$ is any collection
of m indexes, each of which can take the values 1, ..., n, form a basis of
$T^m(L)$. Hence all such products (for all m) form an infinite basis of the
tensor algebra over K. Thus any element of the tensor algebra can be written as
a linear combination of products of the elements $\xi_1, \xi_2, \dots, \xi_n$, and
different products are linearly independent (the order of the factors is
distinguished). In view of this, T(L) is also called the noncommuting polynomial
algebra in n variables $\xi_1, \dots, \xi_n$. As such it is denoted by
$K\langle \xi_1, \dots, \xi_n \rangle$.

The characterisation of the algebra T(L) indicated above has important
applications. We say that elements $\{x_\alpha\}$ (finite or infinite in number) are
generators of an algebra R over a commutative ring A if any element of R can be
written as a linear combination with coefficients in A of certain products of
them. Suppose that an algebra R has a finite number of generators
$x_1, \dots, x_n$ over a field K. Consider the map which takes any element
$\alpha = \sum a_{i_1 \dots i_m} \xi_{i_1} \cdot \ldots \cdot \xi_{i_m}$ of the algebra
$K\langle \xi_1, \dots, \xi_n \rangle$ into the element
$\alpha' = \sum a_{i_1 \dots i_m} x_{i_1} \cdot \ldots \cdot x_{i_m}$ of R. It is easy
to see that we thus get a homomorphism
$K\langle \xi_1, \dots, \xi_n \rangle \to R$ whose image is the whole of R. Thus any
algebra having a finite number of generators is a homomorphic image of a
noncommuting polynomial algebra. In this sense, the noncommuting polynomial
algebras play the same role in the theory of noncommutative algebras as the
commutative polynomial algebras in commutative algebra, or free modules in the
theory of modules.
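The substitution homomorphism can be tried out on a small case. The sketch below is illustrative and its representation is an assumption: a noncommuting polynomial is a dictionary from words (tuples of generator indexes) to coefficients, multiplication concatenates words, and the generators are sent to two chosen 2 x 2 integer matrices.

```python
def nc_mul(f, g):
    """Product of noncommuting polynomials: words are concatenated in order."""
    out = {}
    for w1, a in f.items():
        for w2, b in g.items():
            w = w1 + w2
            out[w] = out.get(w, 0) + a * b
    return {w: v for w, v in out.items() if v}

def mat_mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def evaluate(f, gens):
    """Image of f under the homomorphism sending the i-th variable to gens[i]."""
    total = [[0, 0], [0, 0]]
    for word, coeff in f.items():
        m = [[1, 0], [0, 1]]                  # the empty word maps to the unit
        for i in word:
            m = mat_mul(m, gens[i])
        total = [[total[r][c] + coeff * m[r][c] for c in range(2)]
                 for r in range(2)]
    return total

gens = [[[0, 1], [0, 0]], [[0, 0], [1, 0]]]   # images x_1, x_2 of the two variables
f = {(0, 1): 1, (1, 0): -1}                   # xi_1 xi_2 - xi_2 xi_1
g = {(): 2, (0,): 1}                          # 2 + xi_1
image_of_product = evaluate(nc_mul(f, g), gens)
product_of_images = mat_mul(evaluate(f, gens), evaluate(g, gens))
```

The two results agree, as they must for a homomorphism, while the words (0, 1) and (1, 0) remain distinct basis elements of the noncommuting polynomial algebra.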

We must again interrupt our survey of examples of noncommutative rings to
get to know the simplest method of constructing them. As in the case of
commutative rings, it is natural to pay attention to properties enjoyed by kernels
of homomorphisms. Obviously, if $\varphi\colon R \to R'$ is a homomorphism, then
$\operatorname{Ker} \varphi$ contains the sum a + b of two elements
$a, b \in \operatorname{Ker} \varphi$, and both of the products ax and xa of an
element $a \in \operatorname{Ker} \varphi$ with any element $x \in R$. We have run up
against the fact that the notion of ideal of a commutative ring can be generalised
to the noncommutative case in the three ways (a), (b), (c) below. Consider a
subset $I \subset R$ containing the sum a + b of any two elements $a, b \in I$.

(a) If the product xa is contained in I for any $a \in I$ and $x \in R$, then we say
that I is a left ideal;

(b) if (under the same conditions) ax is contained in I then we say that I is a
right ideal;

(c) if both conditions (a) and (b) hold, we say that I is a two-sided ideal.

Thus the kernel of a homomorphism is a two-sided ideal.

We give examples of these notions. In the ring of linear transformations of a
finite-dimensional vector space L over a division algebra D, a subspace
$V \subset L$ defines a left ideal ${}_V I$, consisting of all transformations
$\varphi$ such that $\varphi(V) = 0$, and a right ideal $I_V$, consisting of all
$\varphi$ such that $\varphi(L) \subset V$. In the ring of bounded linear operators
in a Banach space, all compact (or completely continuous) operators form a
two-sided ideal.

All elements of the form xa with $x \in R$ form a left ideal, and those of the form
ay with $y \in R$ a right ideal. For two-sided ideals the corresponding
construction is a little more complicated. We treat this at once in a more
general form. Let $\{a_\alpha\}$ be a system of elements of a ring R; all sums of the
form $x_1 a_{\alpha_1} y_1 + \dots + x_r a_{\alpha_r} y_r$ with $x_i, y_i \in R$ form a
two-sided ideal. It is called the ideal generated by the system $\{a_\alpha\}$.

In complete analogy with the commutative case we can define the cosets of a 
two-sided ideal and the ring of these cosets. We preserve the previous notation 
R/I and the name quotient ring for this. For example, if R is the ring of bounded
linear operators in a Banach space and I is the two-sided ideal of compact
operators, then many properties of an operator $\varphi$ depend only on its image
in R/I. Thus to say that $\varphi$ satisfies the Fredholm alternative is equivalent
to saying that its image in R/I has an inverse.

The homomorphisms theorem is stated and proved in complete analogy with 
the commutative case (§ 4, Theorem II). 

Let $\{\varphi_\alpha\}$ be a system of elements of the noncommuting polynomial
algebra $K\langle \xi_1, \dots, \xi_n \rangle$ and I the two-sided ideal it
generates. In the algebra $R = K\langle \xi_1, \dots, \xi_n \rangle / I$, we write
$a_1, \dots, a_n$ for the images of the elements $\xi_1, \dots, \xi_n$. These are
obviously generators of R; we say that R is defined by generators
$a_1, \dots, a_n$ and relations $\varphi_\alpha = 0$. By the homomorphisms theorem,
any algebra with a finite number of generators can be defined by some system of
generators and relations. But although the system of generators is by definition
finite, it sometimes happens that the system of relations cannot be chosen to be
finite.

The commutative polynomial ring $K[x_1, \dots, x_n]$ has the defining relations
$x_i x_j = x_j x_i$. Let R be the ring of differential operators with polynomial
coefficients in n variables $x_1, \dots, x_n$. Generators in this algebra are, for
example, the operators $q_i$ of multiplication by $x_i$ (with $q_i(f) = x_i f$) and
$p_i = \partial / \partial x_i$. It is easy to see that it has the defining
relations

$$p_i p_j = p_j p_i, \quad q_i q_j = q_j q_i, \quad p_i q_j = q_j p_i \ \text{ if } i \neq j, \quad \text{and} \quad p_i q_i - q_i p_i = 1. \tag{9}$$
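The last relation in (9) can be observed by letting the operators act on polynomials. The sketch below treats the case n = 1 and assumes, for illustration only, that a polynomial is stored as its list of coefficients in increasing degree; `p`, `q` and the helpers are names introduced here.

```python
def strip(f):
    """Drop trailing zero coefficients, keeping at least one entry."""
    f = list(f)
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return f

def p(f):
    """The operator d/dx on a coefficient list [c0, c1, c2, ...]."""
    return strip([k * f[k] for k in range(1, len(f))] or [0])

def q(f):
    """The operator of multiplication by x."""
    return strip([0] + list(f))

def sub(f, g):
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return strip([a - b for a, b in zip(f, g)])

f = [5, 0, 3, 1]                       # 5 + 3x^2 + x^3
commutator = sub(p(q(f)), q(p(f)))     # (pq - qp) applied to f
```

The commutator returns f itself, which is the statement pq - qp = 1 applied to f: the product rule gives (xf)' - x f' = f.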


We apply this construction to some other important classes of algebras.
Suppose given an n-dimensional vector space L and a symmetric bilinear form,
which we denote by (x, y). We consider the algebra having generators in 1-to-1
correspondence with the elements of some basis of L (and denoted by the same
letters), and relations of the form

$$xy + yx = (x, y) \quad \text{for } x, y \in L. \tag{10}$$

Thus our algebra is the quotient algebra of the tensor algebra T(L) by the ideal
I generated by the elements $(x, y) - xy - yx$.

We consider two extreme cases.


Example 8. Suppose that the bilinear form (x, y) is identically zero. Then it
follows from the relation (10) that $x^2 = 0$ (if the characteristic of K is
$\neq 2$; if char K = 2 then we need to take $x^2 = 0$ as part of the definition).
Any element of the algebra constructed in this way is a linear combination of
products $e_{i_1} \cdot \ldots \cdot e_{i_r}$ of basis elements of L. It is easy to
see that all such products with r factors generate the space $\bigwedge^r(L)$ (see
§ 5 for the definition of the module $\bigwedge^r(M)$). The whole of our algebra is
represented as a direct sum
$\bigwedge^0(L) \oplus \bigwedge^1(L) \oplus \dots \oplus \bigwedge^n(L)$. This
algebra is graded and of finite rank $2^n$; it is called the exterior algebra of L
and denoted $\bigwedge(L)$; multiplication in $\bigwedge(L)$ is denoted by
$x \wedge y$.

It is easy to see that if $x \in \bigwedge^r(L)$ and $y \in \bigwedge^s(L)$ then

$$x \wedge y = y \wedge x \quad \text{if either } r \text{ or } s \text{ is even} \tag{11}$$

and

$$x \wedge y = -y \wedge x \quad \text{if both } r \text{ and } s \text{ are odd}.$$

This can be expressed in another way. Write $\bigwedge(L) = R$, and set

$$\bigwedge\nolimits^0(L) \oplus \bigwedge\nolimits^2(L) \oplus \bigwedge\nolimits^4(L) \oplus \dots = R^0, \qquad \bigwedge\nolimits^1(L) \oplus \bigwedge\nolimits^3(L) \oplus \dots = R^1.$$

Then $R = R^0 \oplus R^1$, and

$$R^0 \cdot R^0 \subset R^0, \quad R^0 \cdot R^1 \subset R^1, \quad R^1 \cdot R^0 \subset R^1, \quad R^1 \cdot R^1 \subset R^0. \tag{12}$$

A decomposition with properties (12) is called a Z/2-grading of the algebra R.
For $R = \bigwedge(L)$ we can state (11) by saying that $x \wedge y = y \wedge x$ if
either x or $y \in R^0$, and $x \wedge y = -y \wedge x$ if both x and $y \in R^1$. An
algebra with a Z/2-grading having these properties is called a superalgebra. The
exterior algebra $\bigwedge(L)$ is the most important example of these. Interest in
superalgebras has been stimulated by quantum field theory. On the other hand, it
turns out that purely mathematically, they form a very natural generalisation of
commutative rings, and can serve as the basis for the construction of geometric
objects, analogues of projective space (superprojective spaces) or of
differential and analytic manifolds (supermanifolds). This theory has
applications to supergravitation in physics, and it is studied by
supermathematicians.
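The sign rules of the exterior algebra can be checked on basis monomials. The following sketch makes assumptions for illustration: a monomial $e_{i_1} \wedge \dots \wedge e_{i_r}$ with increasing indexes is stored as a sorted tuple, an element as a dictionary from such tuples to coefficients, and the sign of a product is found by counting inversions. None of the names come from the text.

```python
from itertools import combinations

def wedge_basis(s, t):
    """Return (sign, sorted monomial) for e_s ^ e_t, or (0, ()) if it vanishes."""
    if set(s) & set(t):
        return 0, ()                      # a repeated factor gives x ^ x = 0
    merged = s + t
    inversions = sum(1 for i in range(len(merged))
                     for j in range(i + 1, len(merged))
                     if merged[i] > merged[j])
    return (-1) ** inversions, tuple(sorted(merged))

def wedge(x, y):
    """Exterior product of elements given as {sorted index tuple: coefficient}."""
    out = {}
    for s, a in x.items():
        for t, b in y.items():
            sign, m = wedge_basis(s, t)
            if sign:
                out[m] = out.get(m, 0) + sign * a * b
    return {m: v for m, v in out.items() if v}

n = 3
basis = [c for r in range(n + 1) for c in combinations(range(1, n + 1), r)]
e1, e2 = {(1,): 1}, {(2,): 1}
e23 = {(2, 3): 1}
```

With n = 3 the list `basis` has 2^3 = 8 entries. The elements `e1` and `e2` are both of odd degree and anticommute, while `e1` and `e23` commute because `e23` has even degree.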


Example 9. The definition of exterior algebra used a basis of the vector space
L (the vectors x, y in (10) belong to it). Of course, the construction does not
depend on this choice. We can give a completely intrinsic (although less
economical) definition, taking x and y in (10) to be all the vectors of L. It is
easy to see that we arrive at the same algebra. In this form the definition is
applicable to any module M over a commutative ring A. We obtain the notion of
the exterior algebra of a module:

$$\bigwedge M = \bigoplus_r \bigwedge\nolimits^r M.$$

If M has a system of n generators then $\bigwedge^r M = 0$ for $r > n$. In
particular, the exterior algebra of the module of differential 1-forms
$\Omega^1$ on an n-dimensional differentiable manifold is called the algebra of
differential forms, $\Omega = \bigoplus_{r \le n} \Omega^r$.

We will see later important applications of the exterior product of forms.


Example 10. Now consider the other extreme case, when the bilinear form (x, y)
in (10) is nondegenerate and corresponds to a quadratic form F(x), that is,
$F(x) = \frac{1}{2}(x, x)$ (we suppose that the characteristic of K is $\neq 2$). We
can argue in this case just as in the previous one, exchanging factors in the
product $e_{i_1} \cdot \ldots \cdot e_{i_r}$ using (10). The only difference is that
for $j < i$ the product $e_i e_j$ gives rise to two terms, one containing
$-e_j e_i$ and one containing $(e_i, e_j)$, giving a product of $r - 2$ factors. As
a result we prove in exactly the same way that the products
$e_{i_1} \cdot \ldots \cdot e_{i_r}$ with $i_1 < \dots < i_r$ form a basis of our
algebra, so that it is again of rank $2^n$. This algebra is called the Clifford
algebra of the vector space L and the quadratic form F, and is denoted by C(L).
The significance of this construction is that in C(L) the quadratic form F
becomes the square of a linear form:

$$F(x_1 e_1 + \dots + x_n e_n) = (x_1 e_1 + \dots + x_n e_n)^2. \tag{13}$$

Thus the quadratic form becomes 'a perfect square', but with coefficients in some
noncommutative algebra. Suppose that $F(x_1, \dots, x_n) = x_1^2 + \dots + x_n^2$;
then by (13), $x_1^2 + \dots + x_n^2 = (x_1 e_1 + \dots + x_n e_n)^2$. Using the
isomorphism between the ring of differential operators with constant
coefficients
$\mathbb{R}\bigl[\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_n}\bigr]$
and the polynomial ring $\mathbb{R}[x_1, \dots, x_n]$, we can rewrite this relation
in the form

$$\frac{\partial^2}{\partial y_1^2} + \dots + \frac{\partial^2}{\partial y_n^2} = \Bigl(e_1 \frac{\partial}{\partial y_1} + \dots + e_n \frac{\partial}{\partial y_n}\Bigr)^2. \tag{14}$$

It was precisely the idea of taking the square root of a second order operator
which motivated Dirac when he introduced a notion analogous to the Clifford
algebra in his derivation of the so-called Dirac equation in relativistic quantum
mechanics.

The products $e_{i_1} \cdot \ldots \cdot e_{i_r}$ with an even number r of factors
generate a subspace $C^0$ of the Clifford algebra C, those with odd r a subspace
$C^1$; clearly, $\dim C^0 = \dim C^1 = 2^{n-1}$. It is easy to see that
$C = C^0 \oplus C^1$ and that this defines a Z/2-grading. In particular $C^0$ is a
subalgebra of C, called the even Clifford algebra.
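The 'perfect square' property and the grading can be tested in a small model. The sketch below is an illustration under explicit assumptions: the form is a sum of squares, so the basis satisfies $e_i^2 = 1$ and $e_i e_j = -e_j e_i$ for distinct i and j; basis monomials are sorted index tuples, elements are dictionaries, and the function names are hypothetical.

```python
from itertools import combinations

def clifford_basis_product(s, t):
    """Product of basis monomials e_s e_t when e_i^2 = 1 and e_i e_j = -e_j e_i."""
    merged = s + t
    # Each transposition of two distinct generators contributes a factor -1.
    inversions = sum(1 for i in range(len(merged))
                     for j in range(i + 1, len(merged))
                     if merged[i] > merged[j])
    rest = tuple(sorted(set(s) ^ set(t)))     # repeated indexes cancel: e_i e_i = 1
    return (-1) ** inversions, rest

def clifford_mul(x, y):
    out = {}
    for s, a in x.items():
        for t, b in y.items():
            sign, m = clifford_basis_product(s, t)
            out[m] = out.get(m, 0) + sign * a * b
    return {m: v for m, v in out.items() if v}

n = 3
basis = [c for r in range(n + 1) for c in combinations(range(1, n + 1), r)]
even = [m for m in basis if len(m) % 2 == 0]
odd = [m for m in basis if len(m) % 2 == 1]

x = {(1,): 3, (2,): 4, (3,): -2}              # 3 e_1 + 4 e_2 - 2 e_3
square = clifford_mul(x, x)                   # expected to be the scalar 9 + 16 + 4
```

The cross terms cancel in pairs because distinct generators anticommute, so the square is a scalar, stored under the empty tuple.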

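Consider the map which sends a basis element $e_{i_1} \cdot \ldots \cdot e_{i_r}$ of
C(L) into the product $e_{i_r} \cdot \ldots \cdot e_{i_1}$ in the opposite order. It
is easy to see that this gives an involution of C(L), which we denote by
$a \mapsto a^*$.


Example 11. Suppose that $F(x_1, x_2) = x_1^2 + x_2^2$ and that $K = \mathbb{R}$.
Then C(L) has rank 4 and basis 1, $e_1$, $e_2$, $e_1 e_2$ with
$e_1^2 = e_2^2 = 1$ and $e_1 e_2 = -e_2 e_1$. It is easy to see that
$C(L) \cong M_2(\mathbb{R})$. For this, we set

$$E_{11} = \frac{1 + e_1}{2}, \quad E_{12} = \frac{-e_2 - e_1 e_2}{2}, \quad E_{21} = \frac{-e_2 + e_1 e_2}{2}, \quad E_{22} = \frac{1 - e_1}{2},$$

and we then need to see that the elements $E_{ij}$ multiply together according
to rule (2). The isomorphism of C(L) with $M_2(\mathbb{R})$ sends $e_1$ to the matrix
$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and $e_2$ to
$\begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}$. Then by (14) the Laplace operator
$\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ can be written as

$$\left( \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \frac{\partial}{\partial x} + \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \frac{\partial}{\partial y} \right)^2.$$

If the operator
$\mathcal{D} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \frac{\partial}{\partial x} + \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \frac{\partial}{\partial y}$
acts on the column $\begin{pmatrix} u \\ v \end{pmatrix}$ then the equation
$\mathcal{D} \begin{pmatrix} u \\ v \end{pmatrix} = 0$ gives:

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x},$$

that is, the Cauchy-Riemann equations.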
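The matrix computation behind this example is short enough to verify directly. The sketch below assumes the images $e_1 \mapsto \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and $e_2 \mapsto \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}$ (the choice that yields the Cauchy-Riemann equations in their usual form) and forms the four combinations $(1 \pm e_1)/2$ and $(-e_2 \mp e_1 e_2)/2$; the helper names are introduced only for this illustration.

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def combine(*terms):
    """Linear combination of 2 x 2 matrices; terms are (coefficient, matrix) pairs."""
    return [[sum(coeff * m[r][c] for coeff, m in terms) for c in range(2)]
            for r in range(2)]

I2 = [[1, 0], [0, 1]]
e1 = [[1, 0], [0, -1]]
e2 = [[0, -1], [-1, 0]]
e1e2 = mat_mul(e1, e2)
half = Fraction(1, 2)

E11 = combine((half, I2), (half, e1))
E12 = combine((-half, e2), (-half, e1e2))
E21 = combine((-half, e2), (half, e1e2))
E22 = combine((half, I2), (-half, e1))
anticommutator = combine((1, e1e2), (1, mat_mul(e2, e1)))
```

Since both matrices square to the identity and their anticommutator vanishes, squaring the first order operator built from them leaves only the two pure second derivatives, which is the factorisation of the Laplace operator.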

Now suppose that $F(x_1, \dots, x_{2n}) = x_1^2 + \dots + x_{2n}^2$. We divide the
indexes 1, ..., 2n into n pairs: (1, 2), (3, 4), ..., (2n - 1, 2n), and write
$\alpha$, $\beta$, etc. to denote any collection $(i_1, \dots, i_n)$ of indexes such
that $i_p$ belongs to the pth pair. If $\alpha = (i_1, \dots, i_n)$ and
$\beta = (j_1, \dots, j_n)$ then we set

$$E_{\alpha\beta} = E_{i_1 j_1} E_{i_2 j_2} \cdots E_{i_n j_n},$$

where the $E_{ij}$ are expressed in terms of $e_i$, $e_j$ as in the case n = 1. It is
easy to see that the $E_{\alpha\beta}$ again multiply according to rule (2), that is
$C(L) \cong M_{2^n}(\mathbb{R})$.

Example 12. If $F(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2$ and $K = \mathbb{R}$, then
the even Clifford algebra $C^0(L)$ is isomorphic to the quaternion algebra:
$e_1 e_2$, $e_2 e_3$, $e_1 e_3$ multiply according to the rule of Example 5.
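The quaternion relations for these three elements follow from the anticommutation of the generators and can be checked mechanically. The sketch below repeats a small model of the Clifford product for a sum of squares (so $e_i^2 = 1$); the tuple representation and the names `I`, `J`, `K` are assumptions for illustration.

```python
def basis_product(s, t):
    """Product of monomials e_s e_t with e_i^2 = 1 and e_i e_j = -e_j e_i."""
    merged = s + t
    inversions = sum(1 for i in range(len(merged))
                     for j in range(i + 1, len(merged))
                     if merged[i] > merged[j])
    return (-1) ** inversions, tuple(sorted(set(s) ^ set(t)))

def mul(x, y):
    out = {}
    for s, a in x.items():
        for t, b in y.items():
            sign, m = basis_product(s, t)
            out[m] = out.get(m, 0) + sign * a * b
    return {m: v for m, v in out.items() if v}

def neg(x):
    return {m: -v for m, v in x.items()}

I = {(1, 2): 1}           # e_1 e_2 plays the role of i
J = {(2, 3): 1}           # e_2 e_3 plays the role of j
K = {(1, 3): 1}           # e_1 e_3 plays the role of k
minus_one = {(): -1}
```

With these names, `mul(I, J)` equals `K` and `mul(J, I)` equals `neg(K)`, in agreement with the multiplication law of the quaternions.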

In the commutative case fields can be characterised as rings without ideals 
(other than 0). In the noncommutative case, as usual, the relation is more 
complicated. One proves just as in the commutative case that the absence of left 
ideals (other than 0) is equivalent to the fact that every element other than 0 has 
a left inverse (satisfying $a^{-1} a = 1$), and right ideals relate to right inverses in the
same way. Thus division algebras are the rings without left ideals (or without 
right ideals), other than 0. 

What does the absence of two-sided ideals correspond to? A ring not having 
any two-sided ideal other than 0 is said to be simple. We will see later the 
exceptionally important role played by simple rings in the theory of rings, so that 
they can, together with division algebras, be considered as a natural extension 
of fields to the noncommutative domain.


Example 13. We determine the structure of left ideals of the ring R of linear
transformations of an n-dimensional vector space L over a division algebra D.
Let us show that the construction given earlier (the ideals ${}_V I$ and $I_V$)
describes all of these, restricting ourselves to left ideals. Let $I \subset R$ be
such an ideal, and $V \subset L$ the subspace consisting of all elements $x \in L$
such that $\varphi(x) = 0$ for all $\varphi \in I$. If $\varphi_1, \dots, \varphi_k$
is a basis of I as a vector space over D then
$\bigcap \operatorname{Ker} \varphi_i = V$. It is easy to see that if
$\varphi \in R$ satisfies $\operatorname{Ker} \varphi = V$ then any linear
transformation $\varphi'$ for which $\operatorname{Ker} \varphi' \supset V$ can be
expressed in the form $\psi\varphi$ with $\psi \in R$. It follows easily from this
that if $\varphi_1, \varphi_2 \in I$ then I contains a transformation
$\bar\varphi$ for which
$\operatorname{Ker} \bar\varphi = \operatorname{Ker} \varphi_1 \cap \operatorname{Ker} \varphi_2$.
Applying this remark to the transformations $\varphi_1, \dots, \varphi_k$, we find
an element $\bar\varphi \in I$ such that $\operatorname{Ker} \bar\varphi = V$, and
by what we have said above, this implies that all transformations $\varphi$ with
$\varphi(V) = 0$ are contained in I, that is,
$I = \{\varphi \mid \varphi(V) = 0\} = {}_V I$. Right ideals are considered in a
similar way.

Suppose finally that I is a two-sided ideal. As a left ideal, I corresponds to
some subspace V, such that $I = \{\varphi \mid \varphi(V) = 0\}$. If V = 0 then
I = R. Otherwise take $x \in V$ with $x \neq 0$. For $\varphi \in I$ we have
$\varphi(x) = 0$. Since I is a right ideal, for any $\psi \in R$ we have
$\varphi\psi \in I$ and hence $\varphi(\psi(x)) = 0$. But we could take $\psi(x)$ to
be any vector of L, and hence I = 0. Thus a ring R isomorphic to $M_n(D)$ is
simple.

Another example of a simple ring is the ring R of differential operators with
polynomial coefficients. To keep the basic idea clear, set n = 1. Interpreting p as
the operator $\frac{d}{dx}$, it is easy to check the relation
$p f(q) - f(q) p = f'(q)$. If $\mathcal{D} = \sum f_i(q) p^i$ is contained in a
two-sided ideal I and $\mathcal{D} \neq 0$ then on passing to the expression
$p\mathcal{D} - \mathcal{D}p$ a number of times, we find an element $\Delta \in I$
which is a nonzero polynomial in p with constant coefficients, $\Delta = g(p)$.
Since the relations (9) do not distinguish between p and q, we have the relation
$g(p) q - q g(p) = g'(p)$. Composing expressions of this form several times, we
find a nonzero constant in I. Hence I = R. (The validity of these arguments
requires that the coefficient field has characteristic zero.)
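The reduction used in this argument can be followed step by step on a concrete operator. The sketch below is an illustration resting on stated assumptions: elements are written in the basis $q^i p^j$ and stored as dictionaries keyed by (i, j), and products are put back into this form with the standard reordering rule $p^b q^c = \sum_k k! \binom{b}{k} \binom{c}{k} q^{c-k} p^{b-k}$, which follows from pq - qp = 1. The operator `D` is an arbitrary example chosen here.

```python
from math import comb, factorial

def weyl_mul(x, y):
    """Product in the basis q^i p^j, using the relation pq - qp = 1."""
    out = {}
    for (a, b), s in x.items():
        for (c, d), t in y.items():
            for k in range(min(b, c) + 1):
                key = (a + c - k, b + d - k)
                coeff = s * t * factorial(k) * comb(b, k) * comb(c, k)
                out[key] = out.get(key, 0) + coeff
    return {m: v for m, v in out.items() if v}

def commutator(x, y):
    """xy - yx."""
    out = dict(weyl_mul(x, y))
    for m, v in weyl_mul(y, x).items():
        out[m] = out.get(m, 0) - v
    return {m: v for m, v in out.items() if v}

P = {(0, 1): 1}
Q = {(1, 0): 1}
D = {(2, 1): 1, (1, 3): 1}            # the operator q^2 p + q p^3

step1 = commutator(P, D)              # pD - Dp lowers the degree in q
delta = commutator(P, step1)          # a polynomial in p alone
constant = commutator(delta, Q)       # g(p)q - qg(p) = g'(p)
```

For this choice of `D` two commutators with p give 2p, and one commutator with q then gives the constant 2, so any two-sided ideal containing `D` contains a nonzero constant.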


Example 14. Algebras which are close to being simple are the Clifford algebras
C(L) and $C^0(L)$ for any vector space L with a quadratic form F (we assume that
the ground field has characteristic $\neq 2$). The following results can be
verified without difficulty. The algebra C(L) is simple if $n \equiv 0 \bmod 2$
(where n = dim L), and then Z(C(L)) = K. The algebra $C^0(L)$ is simple if
$n \equiv 1 \bmod 2$, and then $Z(C^0(L)) = K$. The remaining cases are related to
properties of the element $z = e_1 \dots e_n \in C(L)$, where $e_1, \dots, e_n$ is
an orthogonal basis of L. It is easy to see that z is contained in the centre of
C(L) if $n \equiv 1 \bmod 2$ and in the centre of $C^0(L)$ if
$n \equiv 0 \bmod 2$, and in both cases the centre of these algebras is of the
form K + Kz. Then $z^2 = a \in K$ and $a = (-1)^{n/2} D$ or
$a = 2 \cdot (-1)^{(n-1)/2} D$ depending on the parity of n; D denotes the
discriminant of the form F with respect to the basis $e_1, \dots, e_n$. If a is not
a square in K then $K + Kz = K(\sqrt{a})$ and the corresponding algebra is simple
with centre $K(\sqrt{a})$. If a is a square then the algebra K + Kz is isomorphic
to $K \oplus K$ and the corresponding Clifford algebra is isomorphic to the direct
sum of two simple algebras of the same rank with centre K.


§9. Modules over Noncommutative Rings 


A module over an arbitrary ring R is defined in the same way as in the case of
a commutative ring: it is a set M such that for any two elements $x, y \in M$, the
sum x + y is defined, and for $x \in M$ and $a \in R$ the product $ax \in M$ is
defined, satisfying the following conditions (for all $x, y, z \in M$,
$a, b \in R$):

$$\begin{aligned}
& x + y = y + x; \\
& (x + y) + z = x + (y + z); \\
& \text{there exists } 0 \in M \text{ such that } 0 + x = x + 0 = x; \\
& \text{there exists } -x \text{ such that } x + (-x) = 0; \\
& 1 \cdot x = x; \\
& (ab)x = a(bx); \\
& (a + b)x = ax + bx; \\
& a(x + y) = ax + ay.
\end{aligned} \tag{1}$$

In exactly the same way, the notions of isomorphism, homomorphism, kernel, 
image, quotient module and direct sum do not depend on the commutativity 
assumption. A ring R is a module over itself if we define the product of a (as an 
element of the ring) and x (as an element of the module) to be equal to ax. The 
submodules of this module are the left ideals of R. If I is a left ideal then the 
residue classes mod I form a module R/I over R. The multiplication of x (as an
element of the module) on the right by a (as an element of the ring) does not 
define an R-module structure. In fact, if we denote temporarily this product by 
{ax} = xa (on the left as in the module, on the right as in the ring) then 
{(ab)x} = {b{ax}}, which contradicts the axioms (1). However, we can say that 
in this way R becomes a module over the opposite ring R’. In this module over 
R’, the submodules correspond to right ideals of R. 

The most essential examples of modules over noncommutative rings are first 
of all the ring itself and its ideals as modules over the ring. We will see shortly 
just how useful it is to consider these modules for the study of the rings them- 
selves; and secondly, the study of the many modules over group rings is the 
subject of group representation theory, which will be discussed in detail later. 

If R is an algebra over a field K then every R-module M is automatically a
vector space over K (possibly infinite-dimensional). The module axioms show
that for any $a \in R$ the map $\varphi_a(x) = ax$ for $x \in M$ is a linear
transformation of this vector space. Moreover, sending a to the linear
transformation $\varphi_a$ is a homomorphism of the algebra R into the algebra
$\operatorname{End}_K M$ of all linear transformations of M as a K-vector space.
Conversely, a homomorphism $R \to \operatorname{End}_K L$ (denoted by
$a \mapsto \varphi_a$) into the algebra of linear transformations of a vector
space L obviously defines an R-module structure on L:

$$ax = \varphi_a(x) \quad \text{for } a \in R \text{ and } x \in L. \tag{2}$$


In this situation, we sometimes use somewhat different terminology from that 
usual in the general case. In view of the especial importance of this special case 
we repeat, in the new terminology, the basic definitions given above in the general 
case. 


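Restatement of the Definition of a Module. A representation of an algebra R
over a field K on a K-vector space L is a homomorphism of R to the algebra
$\operatorname{End}_K L$ of linear transformations of L.

In other words, a representation of R sends each element $a \in R$ to a linear
transformation $\varphi_a$ so that the following conditions are satisfied:

$$\varphi_1 = E \quad \text{(the identity transformation)}; \tag{3}$$
$$\varphi_{\alpha a} = \alpha \varphi_a \quad \text{for } \alpha \in K \text{ and } a \in R; \tag{4}$$
$$\varphi_{a+b} = \varphi_a + \varphi_b \quad \text{for } a, b \in R; \tag{5}$$
$$\varphi_{ab} = \varphi_a \varphi_b \quad \text{for } a, b \in R. \tag{6}$$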


Restatement of the Definition of a Submodule. A subrepresentation is a subspace
$V \subset L$ invariant under all transformations $\varphi_a$ for $a \in R$,
with the representation of R induced by these transformations.


Restatement of the Definition of a Quotient Module. The quotient representation
by a subrepresentation $V \subset L$ is the space L/V with the representation
induced by the transformations $\varphi_a$.

If R is an algebra of finite rank over a field K with basis
$1 = e_1, \dots, e_n$ and multiplication table $e_i e_j = \sum_k c_{ijk} e_k$
then conditions (3)-(6) in the definition of a representation reduce to
specifying transformations $\varphi_{e_1}, \dots, \varphi_{e_n}$ satisfying the
relations

$$\varphi_{e_1} = E, \tag{7}$$
$$\varphi_{e_i}\varphi_{e_j} = \sum_k c_{ijk}\varphi_{e_k}. \tag{8}$$


If R = K[G] is the group algebra of a finite group G then conditions (7), (8)
take the form

$$\varphi_e = E, \tag{9}$$
$$\varphi_{g_1 g_2} = \varphi_{g_1}\varphi_{g_2}. \tag{10}$$


Conditions (9) and (10) guarantee that all the transformations $\varphi_g$ are
invertible, since $\varphi_g\varphi_{g^{-1}} = \varphi_e = E$.

If G is an infinite group and the group algebra is defined as the set of linear
combinations of elements of the group (see § 8, Example 4), then a representation
is given by the same conditions (9) and (10). If the group algebra is defined as
the algebra of functions on the group with the convolution operation, the
elements of the group G are only contained in it as delta-functions, and hence
the operators $\varphi_g$ may not exist. On the other hand, if operators
$\varphi_g$ satisfying (9) and (10) exist, then the operator $\varphi_f$
corresponding to a function f can be defined as the integral of the operator
function $f(g)\varphi_g$ taken over the whole group. Hence for group
representations, conditions (9) and (10) give more than conditions (3)-(6) for
the group algebra, and we take conditions (9) and (10) as the definition of a
group representation.
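
Conditions (9) and (10) can be checked mechanically for a concrete family of
operators. The following is a minimal sketch, assuming purely for illustration
the group of all permutations of three symbols acting on a 3-dimensional space
by permutation matrices; the helper names `compose`, `perm_matrix` and `matmul`
are hypothetical and do not come from the text.

```python
from itertools import permutations


def compose(g, h):
    """Product gh of two permutations stored as tuples: (gh)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(len(h)))


def perm_matrix(g):
    """Matrix of the operator sending the basis vector e_j to e_{g(j)}."""
    n = len(g)
    return [[1 if g[j] == i else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


group = list(permutations(range(3)))
identity = tuple(range(3))
E = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

# Condition (9): the unit element goes to the identity transformation.
condition_9 = perm_matrix(identity) == E

# Condition (10): the operator of a product is the product of the operators.
condition_10 = all(
    perm_matrix(compose(g, h)) == matmul(perm_matrix(g), perm_matrix(h))
    for g in group
    for h in group
)
```

Both flags come out true for this family, so the permutation matrices form a
representation in the sense of (9) and (10).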

If the module M in which the representation of an algebra R is realised is
finite-dimensional over a field K then we say that the representation is
finite-dimensional. In this case the linear transformations $\varphi_a$ are
given by matrixes (once a basis of M has been chosen). Let us reformulate once
more the basic notions of representation theory in this language.

A finite-dimensional representation of an algebra R is a homomorphism
$R \to M_n(K)$, which assigns to each element $a \in R$ a matrix
$C_a \in M_n(K)$ satisfying the conditions:

$$C_1 = E,$$
$$C_{\alpha a} = \alpha C_a,$$
$$C_{a+b} = C_a + C_b,$$
$$C_{ab} = C_a C_b.$$

In the case of a representation of a group G these conditions are replaced by

$$C_e = E,$$
$$C_{g_1 g_2} = C_{g_1} C_{g_2}.$$


Restatement of the Notion of Isomorphism of Modules. Two representations
$a \mapsto C_a$ and $a \mapsto C'_a$ are equivalent if there exists a
nondegenerate matrix P such that $C'_a = PC_aP^{-1}$ for every $a \in R$.


Restatement of the Notion of Submodule. A representation $a \mapsto C_a$ has a
subrepresentation $a \mapsto D_a$ if there exists a nondegenerate matrix P such
that the matrixes $C'_a = PC_aP^{-1}$ are of the form

$$C'_a = \begin{pmatrix} D_a & S_a \\ 0 & F_a \end{pmatrix}. \tag{11}$$


Restatement of the Notion of Quotient Module. The matrixes $F_a$ in (11) form
the quotient representation by the given subrepresentation.


Restatement of the Notion of Direct Sum of Modules. If $S_a = 0$ in (11) for
all a, we say that the representation $C_a$ is the direct sum of the
representations $D_a$ and $F_a$.

Considering in particular an algebra R as a module over itself, we get an
important representation of R. It is called the regular representation. If R
has a finite basis $e_1, \dots, e_n$ with structure constants $c_{ijk}$, then
the element $\alpha_1 e_1 + \cdots + \alpha_n e_n$ corresponds in the regular
representation to the matrix $(p_{jk})$, where
$p_{jk} = \sum_i c_{ijk}\alpha_i$.
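
The formula for the regular representation can be tried out on a small algebra.
The sketch below assumes, as an illustration only, the complex numbers viewed
as an algebra of rank 2 over the reals with basis 1, i; the function name
`regular_matrix` and the decision to transpose $(p_{jk})$ so that the matrix
acts on column vectors of coordinates are choices made here, not conventions
fixed by the text.

```python
def regular_matrix(c, alpha):
    """Matrix of left multiplication by sum(alpha[i] * e_i).

    c[i][j][k] is the structure constant c_ijk in e_i e_j = sum_k c_ijk e_k.
    """
    n = len(alpha)
    # p[j][k] is the coefficient of e_k in the product (sum alpha_i e_i) * e_j.
    p = [[sum(c[i][j][k] * alpha[i] for i in range(n)) for k in range(n)]
         for j in range(n)]
    # Transpose, so that column j holds the coordinates of the image of e_j.
    return [[p[j][k] for j in range(n)] for k in range(n)]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Structure constants of the complex numbers with e_0 = 1, e_1 = i.
c = [
    [[1, 0], [0, 1]],   # 1 * 1 = 1,  1 * i = i
    [[0, 1], [-1, 0]],  # i * 1 = i,  i * i = -1
]

A = regular_matrix(c, [1, 2])        # the element 1 + 2i
B = regular_matrix(c, [3, 4])        # the element 3 + 4i
AB = regular_matrix(c, [-5, 10])     # (1 + 2i)(3 + 4i) = -5 + 10i
is_multiplicative = matmul(A, B) == AB
```

Here `A` is the familiar matrix with rows (1, -2) and (2, 1), and the product of
the two matrices agrees with the matrix of the product of the two elements.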


We now return to modules over an arbitrary ring R (not necessarily an algebra) 
and consider an important condition having the character of finite dimensionality. 
This relates to the definition of the dimension of a vector space as the greatest 
length of a chain of subspaces. 

We define the length of an R-module M to be the least upper bound of the
lengths r of chains of submodules:

$$M = M_0 \supsetneq M_1 \supsetneq \cdots \supsetneq M_r = 0.$$

Of course, the length of a module may be either finite or infinite. Consider a
module of finite length r and a chain
$M = M_0 \supsetneq M_1 \supsetneq \cdots \supsetneq M_r = 0$ of maximal
length. If the module $M_i/M_{i+1}$ contained a submodule N distinct from
$M_i/M_{i+1}$ itself and 0, then its inverse image M' under the canonical
homomorphism $M_i \to M_i/M_{i+1}$ would be a submodule with
$M_i \supsetneq M' \supsetneq M_{i+1}$; substituting this into our chain, we
would get a longer chain. Hence there can be no such submodule in
$M_i/M_{i+1}$. We thus arrive at a very important notion.

A module M is simple if it does not have any submodules other than 0 and
M. A representation of an algebra (or of a group) is irreducible if the
corresponding module is simple.

Simpleness is a very strong condition. 


Example 1. In the case of vector spaces over a field, only 1-dimensional spaces 
are simple. 


Example 2. Let L be a finite-dimensional complex vector space with a given
linear transformation $\varphi$, considered as a module over the ring C[t]
(§ 5, Example 3). Since $\varphi$ always has an eigenvector, and therefore a
1-dimensional invariant subspace, L is again simple only if it is 1-dimensional.


Example 3. Consider a ring R as a module over itself. To say that R is simple
means that R does not have left ideals other than 0 and R, that is, R is a
division algebra.

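Let M and N be two simple modules and $\varphi: M \to N$ a homomorphism. By
assumption, $\operatorname{Ker}\varphi = 0$ or M, and
$\operatorname{Im}\varphi = 0$ or N. If $\operatorname{Ker}\varphi = M$ or
$\operatorname{Im}\varphi = 0$ then $\varphi$ is the zero homomorphism. In the
remaining case $\operatorname{Ker}\varphi = 0$ and
$\operatorname{Im}\varphi = N$, so that $\varphi$ is an isomorphism. Thus we
have the following result.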


I. Schur’s Lemma. A homomorphism of one simple module into another is either 
zero or an isomorphism. 


We return to the notion of length. We have seen that if M is a module of length
r then in a chain $M = M_0 \supsetneq M_1 \supsetneq \cdots \supsetneq M_r = 0$
of length r each of the modules $M_i/M_{i+1}$ must be simple.

A chain $M = M_0 \supsetneq M_1 \supsetneq \cdots \supsetneq M_r = 0$ in which
every $M_i/M_{i+1}$ is simple is called a composition series. The following
result holds.


II. The Jordan-Hölder Theorem. All composition series of the same module have
the same length (and in particular they are either all finite or all infinite).
If they are finite then the successive quotients $M_i/M_{i+1}$ appearing in them
are isomorphic (but possibly occurring in a different order).


Thus in a module of finite length the longest chains of submodules are exactly 
the composition series. 

We extend the notion of length to rings. The length of a ring R is its length as
an R-module. Thus a ring has length r if it has a chain
$R \supsetneq I_1 \supsetneq \cdots \supsetneq I_r = 0$ of left ideals and no
longer chains.

We have already seen that a division algebra is a ring of length 1. Left ideals
of a matrix ring $M_n(D)$ over a division algebra D correspond to linear
subspaces of the n-dimensional space L over the opposite division algebra D'
(§ 8, Example 13). Hence the length of $M_n(D)$ is n.

Of course, if a ring R is an algebra over a field K then the length of R does not 
exceed its rank over K. 

A module of finite length is finitely generated (or is of finite type), by analogy 
with the property of Noetherian modules in the case of commutative rings. 

If a ring R has finite length then the length of any finitely generated
R-module is also finite. This follows from the fact that if elements
$x_1, \dots, x_n$ generate a module M then M is a homomorphic image of the
module $R^n$ under the homomorphism
$(a_1, \dots, a_n) \mapsto a_1x_1 + \cdots + a_nx_n$, and is of finite length as
a quotient of a module of finite length.

In conclusion, let us consider in more detail a notion which we have often met 
with in the theory of modules over a commutative ring. 

A homomorphism of an R-module M to itself is called an endomorphism. All
the endomorphisms of a module obviously form a ring; it is denoted by
$\operatorname{End}_R M$. An important difference from the commutative case is
that it is not possible to define the multiplication of an endomorphism
$\varphi \in \operatorname{End}_R M$ by an element $a \in R$. The map
$x \mapsto a\varphi(x)$ is not in general an endomorphism over R; that is,
multiplication of endomorphisms by elements of R is not defined in
$\operatorname{End}_R M$.


Example 4. Consider the ring R as an R-module. What is the ring of
endomorphisms $\operatorname{End}_R R$ of this module? By definition, an
endomorphism $\varphi$ is a map of R to itself which satisfies the conditions

$$\varphi(x + y) = \varphi(x) + \varphi(y), \quad \varphi(ax) = a\varphi(x), \quad \text{for } a, x, y \in R. \tag{12}$$

Setting $\varphi(1) = f$ we get from (12) for x = 1 that $\varphi(a) = af$ for
any $a \in R$. Thus any endomorphism is given as right multiplication by an
element $f \in R$. It follows from this that the ring $\operatorname{End}_R R$
is the opposite ring of R.
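
The reversal of order behind the words 'opposite ring' can be seen in a small
computation. This is only a sketch, assuming for illustration that R is the
ring of 2 x 2 integer matrices; the names `matmul` and `right_mult` and the
particular matrices are hypothetical choices.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def right_mult(f):
    """The endomorphism x -> x f of R regarded as a left module over itself."""
    return lambda x: matmul(x, f)


a = [[1, 2], [3, 4]]
x = [[0, 1], [5, 2]]
f = [[2, 0], [1, 1]]
g = [[1, 1], [0, 3]]

phi_f = right_mult(f)
phi_g = right_mult(g)

# Right multiplication commutes with left multiplication by a, as in (12).
is_module_map = phi_f(matmul(a, x)) == matmul(a, phi_f(x))

# Applying phi_g and then phi_f is right multiplication by g f, not by f g.
composite_is_gf = phi_f(phi_g(x)) == right_mult(matmul(g, f))(x)
order_matters = matmul(f, g) != matmul(g, f)
```

The composite of the endomorphisms attached to f and g is attached to the
product taken in the other order, which is exactly multiplication in the
opposite ring.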


Example 5. Suppose that a module M is isomorphic to a direct sum of n
isomorphic modules P, that is, $M \cong P^n$. Then M consists of n-tuples
$(x_1, \dots, x_n)$ with $x_i \in P$. The situation is exactly the same as in
describing linear transformations of a vector space, and the answer is entirely
analogous. For $\varphi \in \operatorname{End}_R M$ and $x \in P$, set

$$\varphi((0, \dots, x, \dots, 0)) = (\psi_{i1}(x), \dots, \psi_{in}(x))$$

(where x is in the ith place on the left-hand side). Here the $\psi_{ij}$ are
homomorphisms of P to P, that is $\psi_{ij} \in \operatorname{End}_R P$.
Replacing $\varphi$ by the matrix $(\psi_{ij})$ with entries in
$\operatorname{End}_R P$, we get an isomorphism

$$\operatorname{End}_R P^n \cong M_n(\operatorname{End}_R P).$$

In the particular case that P is the division algebra D as module over itself, we
arrive (using the result of Example 4) at the expression found earlier (§ 8, Theorem
1) for the ring of linear transformations over a division algebra.


Example 6. The ring $\operatorname{End}_R M$ of endomorphisms of a simple
module M is a division algebra. This is an immediate consequence of Theorem I.


§10. Semisimple Modules and Rings 


The theory of modules over noncommutative rings, and the study of the
structure of the rings themselves, can be taken well beyond the framework of
general definitions and almost obvious properties treated in the preceding
section, provided we restrict ourselves to objects satisfying the strong, but
frequently occurring, property of semisimpleness.

A module M is semisimple if every submodule of M is a direct summand. This
means that for any submodule $N \subset M$, there exists another submodule
$N' \subset M$ such that $M = N \oplus N'$.

Obviously, a submodule, a homomorphic image and a direct sum of semisimple
modules are semisimple. A simple module is semisimple. Any module of finite
length contains a simple submodule; hence a semisimple module of finite length
is a direct sum of simple modules. It follows from the Jordan-Hölder theorem
(or can be deduced even more simply from § 9, Theorem I) that the decomposition
of a semisimple module as a sum of simple modules is unique (that is, the simple
summands are uniquely determined up to isomorphism). The number of these
summands is the length of the module.

If $P \subset M$ is simple and $N \subset M$ is any submodule then either
$P \subset N$ or $P \cap N = 0$. From this we deduce the following:


Theorem I. If a module is generated by a finite number of simple submodules 
then it is semisimple and of finite length. 


In fact, if $P_1, \dots, P_n$ are simple submodules generating M and
$N \subset M$ but $N \ne M$, then there exists $P_i \not\subset N$. Then
$P_i \cap N = 0$ and the submodule generated by $P_i$ and N is isomorphic to
the direct sum $P_i \oplus N$. Applying the same argument to this, we arrive
after a number of steps at a decomposition $M = N \oplus N'$.


Example 1. Let M be a finite-dimensional vector space L with a linear
transformation $\varphi$, considered as a module over K[t] (see § 5, Example 3).
If M is simple, then L is a 1-dimensional vector space (§ 9, Example 1). Hence M
is semisimple if and only if L can be written as a direct sum of 1-dimensional
invariant subspaces, that is, $\varphi$ can be diagonalised. Semisimpleness in
the general case is also close in meaning to the ‘absence of nondiagonal Jordan
blocks’.


Example 2. Suppose that M corresponds to a finite-dimensional representation
$\varphi$ of an algebra R over the field C; assume that M as a vector space
over C has a Hermitian scalar product (x, y), and that the representation
$\varphi$ has the property that for all $a \in R$ there exists $a' \in R$ such
that $\varphi_a^* = \varphi_{a'}$ (where $\varphi^*$ denotes the complex
conjugate transformation). Then M is a semisimple module.

Indeed, if $N \subset M$ is a subspace invariant under the transformations
$\varphi_a$ for $a \in R$, then its orthogonal complement N' will be invariant
under the transformations $\varphi_a^*$, so by the assumption, under all
transformations $\varphi_a$. Hence $M = N \oplus N'$ as an R-module.


Example 3. Suppose that M corresponds to a finite-dimensional representation
$\varphi$ of a group G over C, again with a Hermitian scalar product defined
such that all the operators $\varphi_g$ are unitary for $g \in G$, that is

$$(\varphi_g(x), \varphi_g(y)) = (x, y). \tag{1}$$

Then M is semisimple. The proof is the same as in Example 2.

The notion of semisimpleness extends to infinite-dimensional representations 
of groups, with the modification that the module M as a vector space over C is 
given a topology or a norm, and the submodule N is assumed to be closed. In 
particular, if in the situation of Example 3 M is a Hilbert space under the 
Hermitian product (x, y) then the argument still works. 


Example 4. A finite-dimensional representation of a finite group G over C 
defines a semisimple module. 

The situation can be reduced to the previous example. Introduce on M
(considered as a vector space over C) an arbitrary Hermitian scalar product
{x, y}, and then set

$$(x, y) = \frac{1}{|G|} \sum_{g \in G} \{\varphi_g(x), \varphi_g(y)\}, \tag{2}$$

where the sum runs over all elements $g \in G$, and |G| denotes the number of
elements of G. It is easy to see that the product (x, y) satisfies the
conditions of Example 3.

The same argument can be adapted to representations over an arbitrary field.


Example 5. If G is a finite group of order n not divisible by the characteristic
of a field K, then any finite-dimensional representation of G over K defines a
semisimple module.

Let M be the space in which the representation $\varphi$ acts, and
$N \subset M$ a submodule. We choose an arbitrary subspace N' such that
$M = N \oplus N'$ (as a vector space). Write $\pi'$ for the projection onto N
parallel to N'; that is, if x = y + y' with $x \in M$, $y \in N$ and
$y' \in N'$ then $\pi'(x) = y$. We consider the linear transformation

$$\pi = \frac{1}{n} \sum_{g \in G} \varphi_g^{-1}\pi'\varphi_g. \tag{3}$$

It is easy to check that $\pi M \subset N$, $\pi x = x$ for $x \in N$ and
$\varphi_g\pi = \pi\varphi_g$ for $g \in G$. It follows from this that $\pi$ is
the projection onto N parallel to the subspace $N_1 = \operatorname{Ker}\pi$
and that $N_1$ is invariant under all $\varphi_g$ for $g \in G$; that is, it
defines a submodule $N_1 \subset M$ such that $M = N \oplus N_1$.
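
The averaging in (3) can be carried out explicitly. The sketch below assumes,
purely as an illustration, the group of all permutations of three symbols acting
on the 3-dimensional rational coordinate space by permutation matrices, with N
spanned by (1, 1, 1) and N' the plane where the third coordinate vanishes; all
function names are hypothetical. It uses the fact that the inverse of a
permutation matrix is its transpose.

```python
from fractions import Fraction
from itertools import permutations


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def perm_matrix(g):
    """Matrix of the operator sending the basis vector e_j to e_{g(j)}."""
    n = len(g)
    return [[Fraction(1 if g[j] == i else 0) for j in range(n)]
            for i in range(n)]


def averaged_projection(operators, proj):
    """(1/n) * sum of g^{-1} proj g over the given permutation matrices g."""
    dim = len(proj)
    total = [[Fraction(0)] * dim for _ in range(dim)]
    for op in operators:
        # For a permutation matrix the inverse equals the transpose.
        term = matmul(matmul(transpose(op), proj), op)
        total = [[total[i][j] + term[i][j] for j in range(dim)]
                 for i in range(dim)]
    return [[entry / len(operators) for entry in row] for row in total]


ops = [perm_matrix(g) for g in permutations(range(3))]

# Projection onto N = span{(1, 1, 1)} parallel to N' = {third coordinate = 0};
# N' is not invariant under the group.
proj0 = [[Fraction(0), Fraction(0), Fraction(1)] for _ in range(3)]

pi = averaged_projection(ops, proj0)
is_projection = matmul(pi, pi) == pi
commutes = all(matmul(op, pi) == matmul(pi, op) for op in ops)
```

In this instance every entry of `pi` equals 1/3, so its kernel is the plane of
vectors with coordinate sum zero, an invariant complement to N.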

We carry over the notion of semisimpleness from modules to rings. A ring R 
is semisimple if it is semisimple as a module over itself. From Examples 4—5 it 
follows that the group algebra of a finite group G over a field K is semisimple if 
the order of the group is not divisible by the characteristic of the field. 


Theorem II. A simple ring R (see § 8) of finite length is semisimple. 


In fact, consider the submodule I of R generated by all simple submodules.
From the fact that R is of finite length it follows that I is generated by a
finite number of simple submodules $P_1, \dots, P_n$. Obviously I is a left
ideal of R. But I is also a right ideal, since for all $a \in R$, $P_i a$ is a
left ideal which is either zero or a simple submodule (being a homomorphic
image of $P_i$), so that $P_i a \subset I$, and hence $Ia \subset I$. Since R
is simple, I = R, that is R is generated by the simple submodules $P_i$, and is
semisimple because of Theorem I.

The theory of modules over semisimple rings has a very explicit character. 


Theorem III. If R is a semisimple ring of finite length and

$$R = P_1 \oplus \cdots \oplus P_n$$

is a decomposition of R (as a module over itself) as a direct sum of simple
submodules then the $P_i$ for i = 1, ..., n are the only simple modules over R.
Any module of finite length is semisimple and is a direct sum of copies of
certain of the modules $P_i$.


In fact, if M is any module and $x_1, \dots, x_k$ are elements of M then we can
define a homomorphism $f: R^k \to M$ by

$$f((a_1, \dots, a_k)) = a_1x_1 + \cdots + a_kx_k.$$

From the fact that M is of finite length it follows that for some choice of the
elements $x_1, \dots, x_k$ the image of f is the whole of M. Thus M is a
homomorphic image of a semisimple module, and is therefore semisimple. If M is
simple, rewriting $R^k$ in the form $P_1^k \oplus \cdots \oplus P_n^k$ we see
that $f(P_i) = 0$ if $P_i$ is not isomorphic to M, and hence $P_i \cong M$ for
some i.

Corollary. In particular, we see that over a semisimple ring of finite length there
are only a finite number of simple modules (up to isomorphism).


We now turn to the description of the structure of semisimple rings of finite
length. As a module over itself, such a ring decomposes as a direct sum of simple
submodules:

$$R = P_1 \oplus \cdots \oplus P_n. \tag{4}$$

In this decomposition we group together all the terms which are isomorphic to
one another as R-modules:

$$R = (P_1 \oplus \cdots \oplus P_{k_1}) \oplus (P_{k_1+1} \oplus \cdots \oplus P_{k_2}) \oplus \cdots \oplus (P_{k_{p-1}+1} \oplus \cdots \oplus P_{k_p}),$$

that is,

$$R = R_1 \oplus R_2 \oplus \cdots \oplus R_p, \tag{5}$$

where

$$R_i = \sum_{k_{i-1} < j \le k_i} P_j \quad (\text{with } k_0 = 0 \text{ and } k_p = n); \tag{6}$$

here in (6) all the simple submodules $P_j$ for $k_{i-1} < j \le k_i$ occurring
in the same summand $R_i$ are isomorphic, and all those occurring in different
$R_i$ are nonisomorphic.

Any simple submodule $P \subset R$ is isomorphic to one of the $P_j$, and it is
not hard to deduce from this that it is contained in the same summand $R_i$ as
$P_j$. In particular, for $a \in R$ the module $P_j a$ is either zero or
isomorphic to $P_j$, and hence $P_j a \subset R_i$ if $P_j \subset R_i$. In
other words, the $R_i$ are not only left ideals (as they must be as
R-submodules), but also right ideals, that is, they are two-sided ideals. It
follows from this that $R_iR_j \subset R_i$ and $R_iR_j \subset R_j$, and hence
$R_iR_j = 0$ for $i \ne j$. It is easy to see that the component of 1 in $R_i$
in the decomposition (5) is the unit element of $R_i$, so that we have a
decomposition as a direct sum of rings

$$R = R_1 \oplus R_2 \oplus \cdots \oplus R_p.$$

Here the $R_i$ are defined entirely uniquely: each of them is generated by all
simple submodules of R isomorphic to a given one.

Consider now one of the rings $R_i$. For this it is convenient to rewrite the
decomposition (6) in the form

$$R_i = P_{i,1} \oplus P_{i,2} \oplus \cdots \oplus P_{i,q_i},$$

where the simple submodules $P_{i,j}$ are all isomorphic, that is,
$R_i \cong N_i^{q_i}$ for some module $N_i$ isomorphic to all of the $P_{i,j}$
for $j = 1, \dots, q_i$.

Let us find the ring of endomorphisms of this module over R. Since
$R_sR_i = 0$ for $s \ne i$, we have
$\operatorname{End}_R R_i = \operatorname{End}_{R_i} R_i$, and thus
$\operatorname{End}_R R_i$ is isomorphic to the opposite ring $R_i'$,
skew-isomorphic to $R_i$ (see § 9, Example 4). On the other hand, by § 9,
Example 5, $\operatorname{End}_R R_i \cong M_{q_i}(\operatorname{End}_R N_i)$,
and $\operatorname{End}_R N_i$ is a division algebra $D_i$ (§ 9, Example 6).
Hence $R_i' \cong M_{q_i}(D_i)$, and

$$R_i \cong M_{q_i}(D_i'). \tag{7}$$


We have seen (§ 8, Example 13) that M,(D) is a simple ring, and hence the rings 
R; in (7) are simple. Putting the decomposition (5) together with the isomorphism 
(7), we obtain the following fundamental theorem. 


IV. Wedderburn’s Theorem. A semisimple ring of finite length is isomorphic to
a direct sum of a finite number of simple rings. A simple ring of finite length
is isomorphic to a matrix ring over some division algebra.


Conversely, we have seen (§ 8, Example 13) that for D a division algebra, the 
ring M,(D) is simple, and it is easy to see that a direct sum of semisimple rings 
is semisimple. Hence Wedderburn’s theorem describes completely the range of 
the class of semisimple rings: they are direct sums of matrix rings over division 
algebras. As a particular case we get: 


Theorem V. A commutative ring of finite length is semisimple if and only if it 
is a direct sum of fields. 


For an arbitrary semisimple ring R, its centre Z(R) is commutative. It is easy
to see that $Z(R_1 \oplus R_2) = Z(R_1) \oplus Z(R_2)$ and that
$Z(M_n(D)) = Z(D)$.


Corollary. The centre of a semisimple ring R is isomorphic to a direct sum of 
fields, and the number of direct summands of the centre is equal to the number of 
direct summands in the decomposition (5) of R as a direct sum of simple rings. 


In particular, we have the result: 


Theorem VI. A semisimple ring of finite length is simple if and only if its centre 
is a field. 


We now illustrate the general theory by means of the fundamental example of
the ring $M_n(D)$. In § 8, Example 13 we saw what the left ideals of
$R = M_n(D)$ look like for a division ring D. Any left ideal of this ring is of
the form ${}_V I = \{a \in R \mid aV = 0\}$, where a is considered as a
D'-linear transformation of an n-dimensional vector space L (over the opposite
division algebra D'), and $V \subset L$ is some vector subspace. Simple
submodules correspond to minimal ideals. Obviously, if $V \subset V'$ then
${}_V I \supset {}_{V'} I$. Hence we obtain a minimal ideal ${}_V I$ if we take
V to be an (n - 1)-dimensional subspace of L. Choose some basis
$e_1, \dots, e_n$ of L and let $V_i$ be the space of vectors whose ith
coordinate in this basis equals zero; the ideal $N_i = {}_{V_i} I$ consists of
matrixes having only the ith column nonzero. The decomposition
$R = N_1 \oplus \cdots \oplus N_n$ corresponds to the decomposition

$$\begin{pmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & & \vdots \\ d_{n1} & \cdots & d_{nn} \end{pmatrix} = \begin{pmatrix} d_{11} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ d_{n1} & 0 & \cdots & 0 \end{pmatrix} + \cdots + \begin{pmatrix} 0 & \cdots & 0 & d_{1n} \\ \vdots & & \vdots & \vdots \\ 0 & \cdots & 0 & d_{nn} \end{pmatrix} \tag{7}$$

of a matrix. On multiplying this on the left by an arbitrary matrix, the ith
column transforms exactly as a vector of L. Thus all the ideals $N_i$ are
isomorphic as R-modules, and are isomorphic to L. This is the unique simple
module over R.
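
The column decomposition and its behaviour under left multiplication can be
checked on a small case. This is a minimal sketch, assuming for illustration
3 x 3 integer matrices (so the division algebra is replaced by the rational
field, and only integer entries are used); `column_part` and the sample
matrices are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def column_part(a, i):
    """The matrix that keeps column i of a and has zeros elsewhere."""
    n = len(a)
    return [[a[r][c] if c == i else 0 for c in range(n)] for r in range(n)]


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]

parts = [column_part(a, i) for i in range(3)]

# The matrix is the sum of its column parts, one from each ideal N_i.
total = parts[0]
for part in parts[1:]:
    total = add(total, part)
decomposes = total == a

# Left multiplication by b keeps each column part inside the same ideal N_i.
stays_in_ideal = all(
    matmul(b, parts[i]) == column_part(matmul(b, a), i) for i in range(3)
)
```

Each column is transformed by the left factor independently of the others,
which is why every $N_i$ is a left ideal and carries the same module structure
as a space of column vectors.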

In the case of an arbitrary semisimple algebra R of finite length, the position
is only a little more complicated. If R decomposes as a direct sum

$$R \cong M_{n_1}(D_1) \oplus \cdots \oplus M_{n_p}(D_p)$$

of p matrix rings then it has p simple modules $N_1, \dots, N_p$, where $N_i$
is an $n_i$-dimensional vector space over the opposite division algebra
$D_i'$, and R acts on $N_i$ as follows: $M_{n_j}(D_j)$ annihilates $N_i$ for
$j \ne i$, and matrixes of $M_{n_i}(D_i)$ act on $N_i$ as befits a matrix
acting on a vector.

The remaining part of this section is devoted to examples and applications of 
the theory we have treated. 

We return to an arbitrary simple ring of finite length, which by Wedderburn’s
theorem is isomorphic to a matrix ring $M_n(D)$ over some division algebra D.
We have given a description of the left ideals of this ring: they are in 1-to-1
inclusion-reversing correspondence with subspaces of an n-dimensional space
over the opposite division algebra D' (that is, $V_1 \subset V_2$ if and only
if ${}_{V_1} I \supset {}_{V_2} I$). To what extent is the whole ring (that is,
the division algebra D and the dimension n) reflected by this partially ordered
set? The set of linear subspaces of an n-dimensional vector space over D
coincides with the set of linear subspaces of (n - 1)-dimensional projective
space $P^{n-1}(D)$ over D. Thus our question is essentially a question about
the axiomatic structure of projective geometry. We recall the solution of this
problem. (The axioms we give are not independent; we have chosen them as the
most intuitively convincing.)

Let $\mathfrak{P}$ be a partially ordered set, that is, for some pairs (x, y)
of elements of $\mathfrak{P}$ an order relation $x \le y$ is defined, such that
the following conditions are satisfied: (a) if $x \le y$ and $y \le z$ then
$x \le z$; and (b) $x \le y$ and $y \le x$ if and only if x = y.

We assume that the following axioms are satisfied:

1. For any set of elements $x_\alpha \in \mathfrak{P}$ there exists an element
y such that $y \ge x_\alpha$ for all $\alpha$, and if $z \ge x_\alpha$ for all
$\alpha$ then $z \ge y$. The element y is called the sum of the $x_\alpha$ and
is denoted by $\bigcup x_\alpha$. In particular, the sum of all
$x \in \mathfrak{P}$ exists (the ‘whole projective space’). It is denoted by I
or $I(\mathfrak{P})$.

2. For any set of elements $x_\alpha \in \mathfrak{P}$ there exists an element
y' such that $y' \le x_\alpha$ for all $\alpha$, and if $z' \le x_\alpha$ for
all $\alpha$ then $z' \le y'$. The element y' is called the intersection of the
$x_\alpha$ and is denoted by $\bigcap x_\alpha$. In particular, the
intersection of all $x \in \mathfrak{P}$ exists (the ‘empty set’). It is
denoted by $\emptyset$ or $\emptyset(\mathfrak{P})$.

From now on, for $x, y \in \mathfrak{P}$ with $y \le x$ we write x/y for the
partially ordered set of all $z \in \mathfrak{P}$ such that $y \le z \le x$.
Obviously conditions 1 and 2 are satisfied in x/y for all x and y.

3. For any x and $y \in \mathfrak{P}$ and $a \in x/y$ there exists an element
$b \in x/y$ such that $a \cup b = I(x/y)$ and $a \cap b = \emptyset(x/y)$. If
$b' \in x/y$ is another element with the same properties and if $b \le b'$ then
b = b'.

4. Finite length: the length of all chains $a_1 \le a_2 \le \cdots \le a_r$
with $a_1 \ne a_2$, $a_2 \ne a_3$, ..., $a_{r-1} \ne a_r$ is bounded.

An element $a \in \mathfrak{P}$ is a point if $b \le a$ and $b \ne a$ implies
that $b = \emptyset$.

5. For any two points a and b there exists a point c such that $c \ne a$,
$c \ne b$ and $c \le a \cup b$.

A partially ordered set satisfying conditions 1 to 5 is called a projective
space. It can be proved that the maximum length of a chain starting from
$\emptyset$ and ending with $a \in \mathfrak{P}$ defines a dimension function
d(a) on points, which satisfies the relation

$$d(a \cap b) + d(a \cup b) = d(a) + d(b).$$

The number d(I) is called the dimension of $\mathfrak{P}$. An example of an
n-dimensional projective space is the space $P^n(D)$ of all linear subspaces of
an (n + 1)-dimensional vector space over a division algebra D.

We have the following result. 


VII. Fundamental Theorem of Projective Geometry. (a) For $n \ge 2$ the
projective space $P^n(D)$ (as a partially ordered set) determines the number n
and the division algebra D; and (b) if $\mathfrak{P}$ is a projective space of
dimension at least 3 then it is isomorphic (as a partially ordered set) to the
projective space $P^n(D)$ over some division algebra D.


The proof is based on artificially introducing a system of coordinates in the 
projective space (that is, ‘coordinatising’ it); the basic idea is already present in the 
calculus of plane intervals (see § 2, Figures 5 and 6). As in this calculus, the set of 
elements which appear as coordinates can be constructed fairly easily. On this 
set one defines operations of addition and multiplication; but the hard part is 
verifying the axioms of a division algebra. The key to it is the following assertion, 
known as ‘Desargues’ theorem’: 


VIII. Desargues’ Theorem. If the 3 lines AA', BB', CC' joining corresponding
vertexes of two triangles ABC and A'B'C' intersect in a point O then the points
of intersection of the corresponding sides are collinear (see Figure 11).

However, this assertion can only be deduced from the axioms of a projective
space if the space has dimension $\ge 3$. In 2 dimensions, that is in the
plane, it does not follow from the axioms, and not every projective plane is
isomorphic to $P^2(D)$. A necessary and sufficient condition for this is that
the previous proposition should hold, and one must add this as an extra axiom,
Desargues’ axiom.

The results we have given characterise the role of completely arbitrary division 
algebras in projective geometry: they allow us to list explicitly all the non- 
isomorphic realisations of the system of axioms of n-dimensional projective 
geometry (together with Desargues’ axiom if n = 2). As one would expect, alge- 
braic properties of division algebras occur as geometric properties of the cor- 
responding geometries. For example, the commutativity of a division algebra D 
is equivalent to the following assertion in the projective space $P^n(D)$ for $n \ge 2$.


IX. Pappus’ ‘Theorem’. If the vertexes of a hexagon $P_1P_2P_3P_4P_5P_6$ lie
3 by 3 on two lines, then the points of intersection of the opposite sides
$P_1P_2$ and $P_4P_5$, $P_2P_3$ and $P_5P_6$, $P_3P_4$ and $P_6P_1$ are
collinear (Figure 12).

Fig. 11


Fig. 12


The condition that the division algebra D is of characteristic 2 is equivalent to 
the following axiom: 


X. Fano’s Axiom. The 3 points of intersection of opposite sides AB and DC, AD 
and BC, AC and BD of a plane quadrilateral ABCD are collinear. 


In Figure 13 this property does not hold, since the real number field has
characteristic $\ne 2$.
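
The dependence on the characteristic can be observed in the planes over the
prime fields. The sketch below assumes a prime p and the particular
quadrilateral A = (1, 0, 0), B = (0, 1, 0), C = (0, 0, 1), D = (1, 1, 1) in
homogeneous coordinates (a choice made here for illustration); lines and
intersection points are computed by cross products modulo p, and the function
names are hypothetical.

```python
def cross(u, v, p):
    """Cross product modulo p: the line through two points, or the
    intersection point of two lines, in homogeneous coordinates."""
    return (
        (u[1] * v[2] - u[2] * v[1]) % p,
        (u[2] * v[0] - u[0] * v[2]) % p,
        (u[0] * v[1] - u[1] * v[0]) % p,
    )


def det3(u, v, w, p):
    """Determinant modulo p of the matrix with rows u, v, w."""
    c = cross(v, w, p)
    return (u[0] * c[0] + u[1] * c[1] + u[2] * c[2]) % p


def diagonal_points_collinear(p):
    """Check Fano's axiom for the quadrilateral ABCD over the field F_p."""
    A, B, C, D = (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)
    first = cross(cross(A, B, p), cross(D, C, p), p)    # AB meets DC
    second = cross(cross(A, D, p), cross(B, C, p), p)   # AD meets BC
    third = cross(cross(A, C, p), cross(B, D, p), p)    # AC meets BD
    # Three points are collinear exactly when their determinant vanishes.
    return det3(first, second, third, p) == 0


results = {p: diagonal_points_collinear(p) for p in (2, 3, 5, 7)}
```

Over the integers the three intersection points are (1, 1, 0), (0, 1, 1) and
(1, 0, 1) up to sign, with determinant 2 up to sign, so for this quadrilateral
the points are collinear only when p = 2.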

The finite models of systems of certain geometric axioms which we considered
in § 1 (see Figures 1-2) relate to the same circle of ideas; they are finite
affine planes over the fields $F_2$ and $F_3$, that is, they are obtained from
the projective planes $P^2(F_2)$ and $P^2(F_3)$ by throwing away one line and
the points on it (which will then be ‘at infinity’ from the point of view of the
geometry of the remaining points and lines).

Fig. 13


There are various infinite-dimensional generalisations of simple rings of finite
length and of the theory treated above. One of these starts off from the
semisimpleness condition (Example 2) and the criterion VI for a semisimple ring
to be simple. This leads to the following definition.

A subring R of the ring of bounded operators on a complex Hilbert space is
called a factor if together with an operator $\varphi$ it contains the
conjugate operator $\varphi^*$, the centre of R consists only of the scalar
operators, and R is closed in the natural topology (the so-called weak
topology).

Similarly to the way in which a simple ring of finite length defines a projective 
space satisfying the axioms 1-5, any factor defines a partially ordered set satisfy- 
ing similar axioms. On this set a dimension function is again defined, but now 
various cases can occur: 


Case $I_n$. The dimension takes the values 0, 1, 2, ..., n; then the factor is
isomorphic to the ring $M_n(C)$.

Case $I_\infty$. The dimension takes the values 0, 1, 2, ..., $\infty$; in this
case the factor is isomorphic to the ring of all bounded operators on an
infinite-dimensional Hilbert space.

Case $II_1$. The dimension takes values in the interval [0, 1].

Case $II_\infty$. The dimension takes values in the interval $[0, \infty]$.

Case III. The dimension takes only the values 0 and $\infty$.


The partially ordered sets corresponding to $II_1$, $II_\infty$ and III are
highly nontrivial infinite-dimensional analogues of projective planes (we
emphasise that the dimensions of subspaces can be any real values), called
continuous geometries.

From now on we restrict ourselves to considering algebras of finite rank over 
a field K. 


Example 6. We saw in § 8, Example 11 that if L is a real 2n-dimensional vector
space with metric $x_1^2 + \cdots + x_{2n}^2$, then the Clifford algebra C(L)
is isomorphic to the matrix algebra $M_{2^n}(\mathbf{R})$. Therefore this
algebra has a unique irreducible representation (up to equivalence), of degree
$2^n$. Thus we can write

$$x_1^2 + \cdots + x_{2n}^2 = (x_1\Gamma_1 + \cdots + x_{2n}\Gamma_{2n})^2,$$

where $\Gamma_1, \dots, \Gamma_{2n}$ are $2^n \times 2^n$ matrixes, and these
matrixes are uniquely determined up to transformations
$\Gamma_i \mapsto C\Gamma_iC^{-1}$. No matrixes smaller than $2^n \times 2^n$
can give such a representation.

Now suppose that the field K is algebraically closed. Then the theory of 
semisimple algebras over K and their representations takes on an especially 
concrete character. The basis of this is the following simple result. 


Theorem XI. A division algebra of finite rank over an algebraically closed field 
K is equal to K itself. 


In fact if D has rank n and $a \in D$ then the elements
$1, a, a^2, \dots, a^n$ must be linearly dependent over K. Hence there exists a
polynomial $F \in K[t]$ of degree $\le n$, not identically 0, for which
F(a) = 0. Since K is algebraically closed,
$F(t) = \gamma \prod_i (t - \alpha_i)$, so that $\prod_i (a - \alpha_i) = 0$.
Since D is a division algebra, we must have $a - \alpha_i = 0$ for some i, so
that $a \in K$. We thus get the following result.


Theorem XI’. Over an algebraically closed field K, any simple algebra of finite
rank is isomorphic to $M_n(K)$, and any semisimple algebra to a direct sum of
such.


Earlier, in Formula (7), we gave an explicit decomposition of the regular
representation of the algebra $M_n(K)$ into irreducible representations. It
follows from this that its regular representation is a sum of n equivalent
n-dimensional representations (corresponding to n-dimensional space $K^n$ as a
module over the matrix ring $M_n(K)$). If
$R \cong M_{n_1}(K) \oplus \cdots \oplus M_{n_p}(K)$ then R has p irreducible
$n_i$-dimensional representations $N_i$ (corresponding to the irreducible
representations of the matrix algebras $M_{n_i}(K)$), and the regular
representation of R is of the form

$$R \cong N_1^{n_1} \oplus N_2^{n_2} \oplus \cdots \oplus N_p^{n_p}, \quad \text{where } n_i = \dim N_i.$$

For the decomposition of the centre there is also only one possibility, namely
$Z(R) \cong K^p$.
As a result, the theory of representations of semisimple algebras of finite rank 
over an algebraically closed field reduces to the following: 
Theorem XII. Every representation is a finite sum of irreducible representations. 


Theorem XIII. All irreducible representations are contained among the irreducible 
factors of the regular representation. The number of nonequivalent representations 
among these is equal to the rank of the centre of the algebra. 


Theorem XIV. Every irreducible representation is contained in the regular 
representation the same number of times as its dimension. 


XV. Burnside’s Theorem. The sum of squares of the dimensions of irreducible
representations equals the rank n of the algebra:

$$n = n_1^2 + \cdots + n_p^2.$$


To specify concretely a representation $\varphi\colon R \to M_n(K)$ we use the traces of
the matrixes $\varphi(a)$ for $a \in R$. The function $\operatorname{Tr}(\varphi(a))$ is defined on R and is linear,
and is hence determined by its values on the elements of a basis of R. Since
$\operatorname{Tr}(CAC^{-1}) = \operatorname{Tr}(A)$, equivalent representations have the same traces. Write
$\operatorname{Tr}_\varphi(a)$ for the function $\operatorname{Tr}(\varphi(a))$.

If a representation $\varphi$ is a direct sum of two others, $\varphi = \varphi_1 \oplus \varphi_2$, then obviously
$\operatorname{Tr}_\varphi(a) = \operatorname{Tr}_{\varphi_1}(a) + \operatorname{Tr}_{\varphi_2}(a)$. The traces of irreducible representations are called
the characters of the algebra R. Let $\chi_1(a), \chi_2(a), \dots, \chi_p(a)$ be the characters
corresponding to the p irreducible representations $\varphi_1, \dots, \varphi_p$. Any representation
$\varphi$ decomposes as a direct sum of irreducibles, and if $\varphi_i$ occurs $m_i$ times amongst
these, then

$$\operatorname{Tr}_\varphi(a) = m_1\chi_1(a) + \cdots + m_p\chi_p(a). \tag{8}$$


We know that in the decomposition of R as a direct sum of simple algebras
$R = R_1 \oplus \cdots \oplus R_p$, the irreducible representation $\varphi_i$ maps to zero all the
summands $R_j$ for $j \neq i$. Together with (8), this implies the following result.


Theorem XVI. The representations of a semisimple algebra are uniquely determined
by their trace functions $\operatorname{Tr}_\varphi(a)$.


The results we have given have a very large number of applications, most
significantly in the special case of a group algebra R = K[G]. We will meet these 
later, but for now we indicate a completely elementary application to the matrix 
algebra. 


XVII. Burnside’s Theorem. An irreducible subalgebra R of a matrix algebra
$\operatorname{End}_K L$ over an algebraically closed field K coincides with the whole of $\operatorname{End}_K L$.


Proof. The hypothesis of the theorem means that if a representation $\varphi$ of R in an
n-dimensional space L is defined by an embedding $R \to \operatorname{End}_K L$ then $\varphi$ will be
irreducible. For any $x \in L$ the map $a \mapsto ax$ defines a homomorphism of R as a
module over itself into the simple module L. Take for x the n elements $e_1, \dots, e_n$
of a basis of L. We get n homomorphisms $f_i\colon R \to L$, or a single homomorphism
$f\colon R \to L^n$ given by $f = f_1 + \cdots + f_n$. If $f(a) = 0$ for some $a \in R$ then $f_i(a) = 0$,
that is, $ae_i = 0$ and $ax = 0$ for all $x \in L$. But $R \subset \operatorname{End} L$, and hence $a = 0$. Hence
$R \subset L^n$ as R-module. Since L is a simple module, it follows from this that R is
semisimple as a module, and hence the algebra R is semisimple, and as a module
$R \cong L^k$ for some k. But according to Theorem XV, $k = n = \dim L$. Hence R has
rank $n^2$, and therefore $R = \operatorname{End}_K L$.

As an illustration, we give a striking application of this result:


XVIII. Burnside’s Theorem. If G is a group of $n \times n$ matrixes over a field K of
characteristic 0, and if there exists a number $N > 0$ such that $g^N = 1$ for all $g \in G$
then G is finite.


For the proof we will use some of the very elementary notions of group theory.
Obviously, extending the field K if necessary, we can assume that it is algebraically
closed. Write R for the set of all combinations of the form

$$\alpha_1 g_1 + \cdots + \alpha_k g_k \quad \text{with } \alpha_i \in K \text{ and } g_i \in G.$$


Obviously, R is an algebra over K and $R \subset M_n(K)$. Consider first the case that
R is irreducible. According to the preceding theorem of Burnside we then have
$R = M_n(K)$. By assumption, the eigenvalues of any element $g \in G$ are Nth roots
of unity. Since the trace $\operatorname{Tr}(g)$ of an $n \times n$ matrix is the sum of n eigenvalues, $\operatorname{Tr}(g)$
can take at most $N^n$ values. It is easy to check that the bilinear form $\operatorname{Tr}(AB)$ on
$M_n(K)$ is nondegenerate. Since $R = M_n(K)$, there exist $n^2$ elements $g_1, \dots, g_{n^2} \in G$
which form a basis of $M_n(K)$. Let $e_1, \dots, e_{n^2}$ be the dual basis with respect to the
bilinear form $\operatorname{Tr}(AB)$. If

$$g = \sum_{i=1}^{n^2} \alpha_i e_i$$

is the expression of an arbitrary element $g \in G$ in this basis, then $\alpha_i = \operatorname{Tr}(g g_i)$.
Thus the coefficients $\alpha_i$ can take only a finite number of values, and hence G is
a finite group.
If R is reducible then the matrixes corresponding to elements of G can
simultaneously be put in the form

$$\begin{pmatrix} A(g) & C(g) \\ 0 & B(g) \end{pmatrix}.$$

Applying induction, we can assume that A(g) and B(g) have already been proved
to form finite groups G′ and G″. Consider the homomorphism $f\colon G \to G' \times G''$
given by $f(g) = (A(g), B(g))$; its kernel consists of elements $g \in G$ corresponding
to matrixes

$$\begin{pmatrix} E & C(g) \\ 0 & E \end{pmatrix}.$$

It is easy to see that if K is a field of characteristic 0 and $C(g) \neq 0$ then this matrix
cannot be of finite order: taking its mth power just multiplies C(g) by m. Hence
the kernel of f consists only of the unit element, that is, G is contained in a finite
group $G' \times G''$, and hence is itself finite.
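
The statement about powers can be checked by hand in the smallest case, where E and C(g) are $1 \times 1$ blocks. The following Python sketch is only an illustration: the helper names `mat_mul` and `mat_pow` and the sample entry 3/7 are our own choices, and exact rational arithmetic stands in for a field of characteristic 0.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(a, m):
    """Return the m-th power of a (m >= 1) by repeated multiplication."""
    result = a
    for _ in range(m - 1):
        result = mat_mul(result, a)
    return result

c = Fraction(3, 7)                    # a sample nonzero entry playing the role of C(g)
u = [[Fraction(1), c],
     [Fraction(0), Fraction(1)]]      # the matrix (E C; 0 E) with 1 x 1 blocks

for m in range(1, 6):
    # The upper right entry of the m-th power is m * c, which is never 0.
    print(m, mat_pow(u, m)[0][1])
```

Over a field of characteristic p the same computation gives the identity matrix for m = p, which is why the hypothesis of characteristic 0 is needed at this step.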

Wedderburn’s theorem entirely reduces the study of semisimple algebras of
finite rank over a field K to that of division algebras of finite rank over the same
field. We now concentrate on this problem. If D is a division algebra of finite
rank over K and L the centre of D then L is a finite extension of K and we can
consider D as an algebra over L. Hence the problem divides into two: to study
finite field extensions, which is a question of commutative algebra or Galois
theory, and that of division algebras of finite rank over a field which is its centre.
If an algebra D of finite rank over a field K has K as its centre, then we say that
D is a central algebra over K.

The question of the existence of central division algebras of finite rank over
a given field K and of their structure depends in a very essential way on particular 
properties of K. We have already met one very simple result in this direction: 
over an algebraically closed field K there do not exist division algebras of finite 
rank other than K. In particular this holds for the complex number field. 

For the case of the real number field the situation is not much more complicated. 


I. Frobenius’ Theorem. The only division algebras of finite rank over the 
real number field R are R itself, the complex number field C, and the quaternion 
algebra H. 


Here are another two cases when the situation is simple. 


II. Wedderburn’s Theorem. Over a finite field K, there do not exist any central 
division algebras of finite rank other than K itself. 


In other words, a finite division algebra is commutative. This is of course 
interesting for projective geometry, since it shows that for finite projective 
geometries of dimension >2, Pappus’ theorem follows from the other axioms 
(and in dimension 2, from Desargues’ theorem). 


III. Tsen’s Theorem. If K is an algebraically closed field and C is an irreducible 
algebraic curve over K, then there do not exist any central division algebras of 
finite rank over K(C) other than K(C) itself. 


The three cases in which we have asserted that over some field K there do
not exist any central division algebras of finite rank other than K itself can all
be unified by one property: a field K is quasi-algebraically closed if for every
homogeneous polynomial $F(t_1, \dots, t_n) \in K[t_1, \dots, t_n]$ of degree less than the
number n of variables, the equation

$$F(x_1, \dots, x_n) = 0$$

has a nonzero solution in K.
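
For a small finite field the defining property can be tested by exhaustive search. The sketch below is an illustration only: the function names are ours, and we restrict to quadratic forms in 3 variables (degree 2 less than n = 3) over $\mathbb{F}_p$ for a few small primes p. For contrast it also examines one form whose degree equals the number of variables, $x^2 + y^2$ over $\mathbb{F}_3$, for which no nonzero solution exists.

```python
from itertools import product

def has_nonzero_zero(coeffs, p):
    """Does a*x^2 + b*y^2 + c*z^2 + d*x*y + e*x*z + f*y*z vanish at some
    nonzero point of F_p^3?  coeffs = (a, b, c, d, e, f)."""
    a, b, c, d, e, f = coeffs
    for x, y, z in product(range(p), repeat=3):
        if (x, y, z) == (0, 0, 0):
            continue
        if (a*x*x + b*y*y + c*z*z + d*x*y + e*x*z + f*y*z) % p == 0:
            return True
    return False

def all_ternary_quadratic_forms_have_zero(p):
    """Check every quadratic form in 3 variables over F_p (degree 2 < 3)."""
    return all(has_nonzero_zero(coeffs, p)
               for coeffs in product(range(p), repeat=6))

def binary_sum_of_squares_has_zero(p):
    """Does x^2 + y^2 (degree 2 = number of variables) have a nonzero zero?"""
    return any((x*x + y*y) % p == 0
               for x, y in product(range(p), repeat=2) if (x, y) != (0, 0))

for p in (2, 3, 5):
    print(p, all_ternary_quadratic_forms_have_zero(p))
print("x^2 + y^2 over F_3:", binary_sum_of_squares_has_zero(3))
```

The search is a finite check for the listed primes and degrees only; it does not replace a proof for general finite fields.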

It can be shown that if K is any quasi-algebraically closed field, then the only 
central division algebra of finite rank over K is K itself. On the other hand, each 
of the fields considered above is quasi-algebraically closed: algebraically closed 
fields, finite fields and function fields K(C) where K is algebraically closed and 
C is an irreducible algebraic curve. Tsen’s theorem is proved starting from this 
property, and this is one possible method of proof of Wedderburn’s theorem. 
The theorem that a finite field is quasi-algebraically closed is called Chevalley’s
theorem. For the case of the field $\mathbb{F}_p$, this is an interesting property of congruences.
The property of being quasi-algebraically closed is a direct weakening of algebraic
closure; this becomes obvious if we start with the following characterisation
of algebraically closed fields.


Theorem. A field K is algebraically closed if and only if for every homogeneous
polynomial $F(t_1, \dots, t_n) \in K[t_1, \dots, t_n]$ of degree less than or equal to the number
n of variables, the equation

$$F(x_1, \dots, x_n) = 0 \tag{1}$$

has a nonzero solution in K.


Proof. Obviously, for an algebraically closed field K the equation (1) has a
nonzero solution. Suppose that K is not algebraically closed. Then over K there
is an irreducible polynomial P(t) of degree $n > 1$. The ring $L = K[t]/(P)$ is a field
extension of K of degree n, and is hence an algebra of rank n over K. Considering
the regular representation of this algebra, we take each element $x \in L$ to
the matrix $A_x \in M_n(K)$. The determinant of the matrix $A_x$ is called the norm of
x, and is denoted by N(x). From properties of representations (and of determinants)
it follows that $N(1) = 1$, and $N(xy) = N(x)N(y)$. From this if $x \neq 0$ then
$N(x)N(x^{-1}) = 1$, and hence $N(x) \neq 0$. Consider any basis $e_1, \dots, e_n$ of L/K (for
example the images of the elements $1, t, \dots, t^{n-1}$ of K[t]); we write any element
$x \in L$ as $x_1 e_1 + \cdots + x_n e_n$ with $x_i \in K$. It is easy to see that N(x) is a polynomial
of degree n in $x_1, \dots, x_n$, and setting

$$F(x_1, \dots, x_n) = N(x_1 e_1 + \cdots + x_n e_n)$$

we get an example of an equation of type (1) with no nonzero solution.
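
The construction in the proof can be carried out explicitly in a small case. As an assumption made purely for illustration, take $K = \mathbb{F}_7$ and $P(t) = t^3 - 2$, which is irreducible over $\mathbb{F}_7$ because 2 is not a cube modulo 7. In the basis $1, t, t^2$ the matrix $A_x$ of multiplication by $x = x_1 + x_2 t + x_3 t^2$ can be written down directly, and its determinant is a cubic form in 3 variables. The function names below are ours.

```python
from itertools import product

P_MOD = 7   # K = F_7; t^3 - 2 has no root modulo 7, hence is irreducible

def regular_matrix(x1, x2, x3):
    """Matrix A_x of multiplication by x1 + x2*t + x3*t^2 in K[t]/(t^3 - 2),
    in the basis 1, t, t^2 (columns are the images of the basis elements)."""
    return [[x1, 2 * x3, 2 * x2],
            [x2, x1,     2 * x3],
            [x3, x2,     x1]]

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def norm(x1, x2, x3):
    """The norm N(x) = det A_x, reduced modulo 7."""
    return det3(regular_matrix(x1, x2, x3)) % P_MOD

# Collect all nonzero points of F_7^3 at which the norm form vanishes.
zeros = [v for v in product(range(P_MOD), repeat=3)
         if v != (0, 0, 0) and norm(*v) == 0]
print(zeros)
```

Expanding the determinant gives $N(x) = x_1^3 + 2x_2^3 + 4x_3^3 - 6x_1x_2x_3$, a form of degree 3 in 3 variables; the list `zeros` collects its nonzero zeros over $\mathbb{F}_7$.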

We proceed now to fields over which there do exist central division algebras 
of finite rank. Up to now we know one such field, the real number field R, over 
which there exists a central division algebra of rank 1 (R itself) and of rank 4 (the 
quaternions H). These numbers are not entirely fortuitous, as the following result 
shows. 


Lemma. The rank of a simple central algebra is a perfect square. 


The proof is based on the important notion of extending the ground field. If
R is an algebra of finite rank over a field K and L is an arbitrary extension of
K, we consider the module $R \otimes_K L$ (see §5); we define a multiplication on its
generators $a \otimes \xi$ by

$$(a \otimes \xi)(b \otimes \eta) = ab \otimes \xi\eta \quad \text{for } a, b \in R,\ \xi, \eta \in L.$$

It is easy to verify that this turns $R \otimes_K L$ into a ring, which contains L (as $1 \otimes L$),
so that this is an algebra over L. If $e_1, \dots, e_n$ is a basis of R over K then
$e_1 \otimes 1, \dots, e_n \otimes 1$ is a basis of $R \otimes_K L$ over L. Hence the rank of $R \otimes_K L$ over L equals
that of R over K. Passing from R to $R \otimes_K L$ is called extending the ground field.
To put things simply, $R \otimes_K L$ is the algebra over L having the same multiplication
table as R.

It is not hard to prove the following assertion: 


Theorem IV. The property that an algebra should be central and simple is 
preserved under extension of the ground field. 


Now we just have to take L to be the algebraic closure of K; by the general
theory $R \otimes_K L \cong M_n(L)$, and hence the rank of R over K, equal to the rank of
$R \otimes_K L$ over L, is $n^2$.

Thus the quaternion algebra realises the minimal possible value of the rank
of a nontrivial central division algebra. The next case in order of difficulty, of
interest mainly in number theory, is the case of the p-adic number field $\mathbb{Q}_p$,
introduced in § 7.


V. Hasse’s Theorem. For any n there exist $\varphi(n)$ central division algebras of rank
$n^2$ over the field $\mathbb{Q}_p$ (where $\varphi$ is the Euler function).


The proof of the theorem indicates a method of assigning to each such algebra
D a certain primitive nth root of unity, which determines D. This root of unity
is called the invariant of the division algebra D, and is denoted by $\mu_p(D)$.


We consider the simplest example. For any field K of characteristic $\neq 2$,
suppose that $a, b \in K$ are two nonzero elements. Construct the algebra of rank
4 over K with basis 1, i, j, k and multiplication table

$$i^2 = a, \quad j^2 = b, \quad ji = -ij = k$$

(the remaining products can be computed from these rules using associativity).
The algebra so obtained is called a generalised quaternion algebra, and is denoted
by (a, b). For example $\mathbb{H} = (-1, -1)$. It is easy to prove that the algebra (a, b) is
simple and central, and any simple central algebra of rank 4 can be expressed in
this form. Thus by the general theory of algebras, (a, b) is either a division algebra,
or isomorphic to $M_2(K)$. Let us determine how to distinguish between these two
cases. For this, by analogy with the quaternions, define the conjugate of the
element $x = \alpha + \beta i + \gamma j + \delta k$ to be $\bar{x} = \alpha - \beta i - \gamma j - \delta k$. It is easy to see that

$$\overline{xy} = \bar{y}\,\bar{x} \quad \text{and} \quad x\bar{x} = \alpha^2 - a\beta^2 - b\gamma^2 + ab\delta^2 \in K. \tag{2}$$

Set $N(x) = x\bar{x}$. It follows from (2) that $N(xy) = N(x)N(y)$. Hence if $N(x) = 0$ for
some nonzero x then (a, b) is not a division algebra: $x\bar{x} = 0$, although $x \neq 0$ and
$\bar{x} \neq 0$. If on the other hand $N(x) \neq 0$ for every nonzero x, then $x^{-1} = N(x)^{-1}\bar{x}$
and (a, b) is a division algebra. Thus (a, b) is a division algebra if and only if the
equation $\alpha^2 - a\beta^2 - b\gamma^2 + ab\delta^2 = 0$ has only the zero solution for $(\alpha, \beta, \gamma, \delta)$ in
K. This equation can be further simplified by writing it in the form

$$\alpha^2 - a\beta^2 = b(\gamma^2 - a\delta^2),$$

or

$$b = \frac{\alpha^2 - a\beta^2}{\gamma^2 - a\delta^2} = \frac{N(\alpha + \beta i)}{N(\gamma + \delta i)} = N\!\left(\frac{\alpha + \beta i}{\gamma + \delta i}\right) = N(\xi + \eta i) = \xi^2 - a\eta^2,$$

where $\xi + \eta i = \dfrac{\alpha + \beta i}{\gamma + \delta i}$. Thus the condition that (a, b) is a division algebra takes
the form: the equation $\xi^2 - a\eta^2 = b$ has no solutions in K. The homogeneous
form of the same equation is

$$\xi^2 - a\eta^2 - b\zeta^2 = 0. \tag{3}$$
This should not have any nonzero solution in K, otherwise $(a, b) \cong M_2(K)$. It is easy to
see that if (3) has a nonzero solution, then it has a solution for which $\xi \neq 0$. It
then reduces to the equation

$$ax^2 + by^2 = 1. \tag{4}$$
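
The arithmetic of (a, b) is easy to experiment with. The sketch below is an illustration under our own conventions: an element is stored as a tuple of its four coefficients in the basis 1, i, j, k, the product is expanded from the rules $i^2 = a$, $j^2 = b$, $ji = -ij = k$, and the names `qmul`, `conj`, `norm` and `equation3_has_solution` are ours. The last function searches for a nonzero solution of equation (3) over a prime field $\mathbb{F}_p$ with p odd.

```python
from itertools import product

def qmul(x, y, a, b):
    """Product in the generalised quaternion algebra (a, b); elements are
    coefficient tuples in the basis 1, i, j, k, where k = ji = -ij."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0*y0 + a*x1*y1 + b*x2*y2 - a*b*x3*y3,   # uses k^2 = -ab
        x0*y1 + x1*y0 + b*x2*y3 - b*x3*y2,       # uses jk = bi, kj = -bi
        x0*y2 + x2*y0 - a*x1*y3 + a*x3*y1,       # uses ik = -aj, ki = aj
        x0*y3 + x3*y0 - x1*y2 + x2*y1,           # uses ij = -k, ji = k
    )

def conj(x):
    """The conjugate: change the sign of the i, j, k coefficients."""
    return (x[0], -x[1], -x[2], -x[3])

def norm(x, a, b):
    """N(x) as in formula (2)."""
    return x[0]**2 - a*x[1]**2 - b*x[2]**2 + a*b*x[3]**2

def equation3_has_solution(a, b, p):
    """Search F_p^3 for a nonzero solution of xi^2 - a*eta^2 - b*zeta^2 = 0."""
    return any((xi*xi - a*eta*eta - b*zeta*zeta) % p == 0
               for xi, eta, zeta in product(range(p), repeat=3)
               if (xi, eta, zeta) != (0, 0, 0))

a, b = 2, -3
x, y = (1, 2, 3, 4), (5, -6, 7, 8)
print(qmul(x, conj(x), a, b), norm(x, a, b))
print(norm(qmul(x, y, a, b), a, b) == norm(x, a, b) * norm(y, a, b))
print(all(equation3_has_solution(a, b, 7) for a in range(1, 7) for b in range(1, 7)))
```

The final line runs over all pairs of nonzero a, b in $\mathbb{F}_7$; finding a solution of (3) in every case is what Wedderburn’s theorem on finite fields leads one to expect.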


Now suppose that K is the rational number field Q. Equation (3) is exactly
the equation to which Legendre’s theorem stated at the end of §5 relates. This
asserts that (3) has a solution in Q if and only if it has a solution in R and in all
the fields $\mathbb{Q}_p$. In this form the theorem gives us information about the generalised
quaternion algebra (a, b) for $a, b \in \mathbb{Q}$. By what we have said above, the algebra
$C = (a, b)$ is isomorphic to $M_2(\mathbb{Q})$ if and only if $C \otimes \mathbb{R} \cong M_2(\mathbb{R})$ and
$C \otimes \mathbb{Q}_p \cong M_2(\mathbb{Q}_p)$ for all p. But the same line of argument can be carried further, to describe
all generalised quaternion algebras over Q. That is, one can show that two such
algebras $C = (a, b)$ and $C' = (a', b')$ are isomorphic if and only if $C \otimes \mathbb{R} \cong C' \otimes \mathbb{R}$
and $C \otimes \mathbb{Q}_p \cong C' \otimes \mathbb{Q}_p$ for all p. In other words, consider the invariants $\mu_p$ of
division algebras of rank 4 over $\mathbb{Q}_p$ (which by definition are equal to $-1$) and
for a simple central algebra C over Q set

$\mu_p(C) = \mu_p(C \otimes \mathbb{Q}_p) = -1$ if $C \otimes \mathbb{Q}_p$ is a division algebra;
$\mu_p(C) = 1$ if $C \otimes \mathbb{Q}_p \cong M_2(\mathbb{Q}_p)$;

moreover, set

$\mu_{\mathbb{R}}(C) = -1$ if $C \otimes \mathbb{R}$ is a division algebra (that is, $\cong \mathbb{H}$);
$\mu_{\mathbb{R}}(C) = 1$ if $C \otimes \mathbb{R} \cong M_2(\mathbb{R})$.

Then the above result can be restated as follows: 


Theorem VI. A division algebra C of rank 4 over Q is determined by $\mu_{\mathbb{R}}(C)$ and
$\mu_p(C)$ for all p, which we call the set of invariants of C.


What can this set of invariants be? We have seen that not all $\mu_{\mathbb{R}}(C)$ and $\mu_p(C)$
can be equal to 1 (because then, by Legendre’s theorem, C would not be a division
algebra). Moreover, it is easy to prove that $\mu_p(C) = -1$ holds for only a finite
number of primes p. It turns out that there is only one more condition apart from
these.


Theorem VII. An arbitrary set of choices $\mu_{\mathbb{R}}, \mu_p = \pm 1$ (for each prime p) is the
set of invariants of some central division algebra of rank 4 over Q if and only if:
(a) not all $\mu_{\mathbb{R}}$ and $\mu_p$ are 1; (b) only a finite number of them are $-1$;
and (c)

$$\mu_{\mathbb{R}} \prod_p \mu_p = 1. \tag{5}$$

Amazingly, the relation (5) turns out just to be a restatement of Gauss’ law of
quadratic reciprocity, which thus becomes one of the central results of the theory
of division algebras over Q.
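
Relation (5) can be tested numerically for the algebras (a, b) with nonzero integers a, b. The sketch below rests on assumptions that are not part of the text: it identifies $\mu_p((a, b))$ with the classical Hilbert symbol of a and b at p (which equals 1 exactly when equation (3) has a nonzero solution in $\mathbb{Q}_p$), and it evaluates that symbol by the standard explicit formulas in terms of Legendre symbols for odd p and residues modulo 8 for p = 2. All function names are ours.

```python
def legendre(u, p):
    """Legendre symbol (u/p) for an odd prime p and u prime to p."""
    return 1 if pow(u % p, (p - 1) // 2, p) == 1 else -1

def split_power(a, p):
    """Write a = p**alpha * u with u prime to p; return (alpha, u)."""
    alpha = 0
    while a % p == 0:
        a //= p
        alpha += 1
    return alpha, a

def mu_p(a, b, p):
    """Hilbert symbol of a, b at the prime p (assumed formula, see lead-in)."""
    alpha, u = split_power(a, p)
    beta, v = split_power(b, p)
    if p == 2:
        eps = lambda w: ((w - 1) // 2) % 2        # 0 if w = 1 mod 4, else 1
        omega = lambda w: ((w * w - 1) // 8) % 2  # 0 if w = +-1 mod 8, else 1
        e = eps(u) * eps(v) + alpha * omega(v) + beta * omega(u)
        return -1 if e % 2 else 1
    sign = -1 if (alpha * beta * ((p - 1) // 2)) % 2 else 1
    return sign * legendre(u, p) ** beta * legendre(v, p) ** alpha

def mu_real(a, b):
    """(a, b) over R is a division algebra exactly when a < 0 and b < 0."""
    return -1 if a < 0 and b < 0 else 1

def prime_divisors(n):
    """Prime divisors of |n| by trial division."""
    n, primes, d = abs(n), [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def invariants(a, b):
    """Invariants at R and at the primes dividing 2ab (elsewhere they are 1)."""
    result = {"R": mu_real(a, b)}
    for p in sorted(set([2] + prime_divisors(a) + prime_divisors(b))):
        result[p] = mu_p(a, b, p)
    return result

def product_of_invariants(a, b):
    total = 1
    for value in invariants(a, b).values():
        total *= value
    return total

print(invariants(-1, -1))
print(product_of_invariants(3, 5), product_of_invariants(-7, 10))
```

For $\mathbb{H} = (-1, -1)$ this computation reports the invariant $-1$ at R and at p = 2 only, and the products printed on the last line are instances of relation (5).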

The results we have treated generalise immediately to arbitrary central division
algebras of finite rank over Q. Let C be a central simple algebra over Q of finite
rank $n^2$. The algebra $C \otimes \mathbb{R} = C_{\mathbb{R}}$ is isomorphic either to $M_n(\mathbb{R})$, and then we set
$\mu_{\mathbb{R}}(C) = 1$, or to $M_{n/2}(\mathbb{H})$, and then by definition we set $\mu_{\mathbb{R}}(C) = -1$. Similarly,
for any prime number p the algebra $C \otimes \mathbb{Q}_p$ is of the form $M_k(C_p)$, where $C_p$ is a
central division algebra over $\mathbb{Q}_p$. We set $\mu_p(C) = \mu_p(C_p)$ (see Hasse’s theorem,
Theorem V above). We have the following results:


VIII. The Hasse-Brauer-Noether Theorem. $C \cong M_n(\mathbb{Q})$ if and only if $C_{\mathbb{R}} \cong M_n(\mathbb{R})$
and $C \otimes \mathbb{Q}_p \cong M_n(\mathbb{Q}_p)$ for all p, that is $\mu_{\mathbb{R}}(C) = \mu_p(C) = 1$ for all p.


Hasse’s Theorem. Two central division algebras C and C′ over Q are isomorphic
if and only if $C \otimes \mathbb{R} \cong C' \otimes \mathbb{R}$ and $C \otimes \mathbb{Q}_p \cong C' \otimes \mathbb{Q}_p$ for all p, that is $\mu_{\mathbb{R}}(C) = \mu_{\mathbb{R}}(C')$
and $\mu_p(C) = \mu_p(C')$ for all p. A set of numbers $\mu_{\mathbb{R}}$ and $\mu_p$ (for all p) can be
realised as $\mu_{\mathbb{R}} = \mu_{\mathbb{R}}(C)$ and $\mu_p = \mu_p(C)$ for a central division algebra C over Q if
and only if (a) $\mu_p \neq 1$ for only a finite number of primes p; and (b)

$$\mu_{\mathbb{R}} \prod_p \mu_p = 1. \tag{6}$$


Entirely similar results hold for finite extensions K of Q, that is, algebraic 
number fields. They form part of class field theory. The analogue of relation (6) 
for any algebraic number field is a far-reaching generalisation of Gauss’ law of 
quadratic reciprocity. 

The sketch given here of the structure of division algebras over the rational
number field can serve as an example of just how closely the structure of division
algebras over a field K relates to delicate properties of K.

We give one more example: the structure of central division algebras of finite
rank over the field R(C) where C is a real algebraic curve. In this case, any central
division algebra is a generalised quaternion algebra, and is even of the form
$(-1, a)$ for $a \in \mathbb{R}(C)$, $a \neq 0$. The algebra $(-1, a)$ is isomorphic to $M_2(\mathbb{R}(C))$ if and
only if $a(x) \ge 0$ for every point $x \in C$ (including points at infinity in the projective
plane). The function a(x) on C changes sign at a finite number of points of C,
$x_1, \dots, x_n$ (Figure 14 illustrates the case of the curve

$$y^2 + (x^2 - 1)(x^2 - 4) = 0$$

and the function $a = y$). The division algebra $(-1, a)$ is determined by these
points $x_1, \dots, x_n$ at which the sign changes. More complicated, but also more
interesting is the example of the field $\mathbb{C}(C)$ where C is an algebraic surface. The
structure of division algebras in this case reflects very delicate geometric properties
of the surface. We will return to these questions in § 12 and § 22.

Fig. 14

We start with the notion of a transformation group: the notion of a group first
arose in this form, and it is in this form that it occurs most often in mathematics
and mathematical physics.

A transformation of a set X is a 1-to-1 map $f\colon X \to X$ from X to itself,
that is, a map for which there exists an inverse map $f^{-1}\colon X \to X$, satisfying
$f^{-1} \circ f = f \circ f^{-1} = e$. Here $f \circ g$ denotes the product of maps (that is, the
composite, the map obtained by performing g and f successively):

$$(f \circ g)(x) = f(g(x)) \quad \text{for } x \in X, \tag{1}$$

and e is the identity map

$$e(x) = x \quad \text{for } x \in X.$$


We say that a set G of transformations of X is a transformation group if it
contains the identity transformation e and contains together with any $g \in G$ the
inverse $g^{-1}$, and together with any $g_1, g_2 \in G$ the product $g_1 g_2$.

Usually these conditions are fulfilled in an obvious way, because G is defined 
as the set of all transformations preserving some property. For example, the 
transformations of a vector space preserving scalar multiplication and the addi- 
tion of vectors (that is, g(ax) = ag(x) and g(x + y) = g(x) + g(y)); these form the 
group of nondegenerate linear transformations of the vector space. The trans- 
formations preserving the distance p(x, y) between points of a Euclidean space 
(that is, such that p(g(x), g(y)) = p(x, y)) form the group of motions. If the trans- 
formations are assumed to preserve a given point, then we have the group of 
orthogonal transformations. 

The group of transformations of one kind or another which preserve some
object can often be interpreted as its set of symmetries. For example, the fact that
an isosceles triangle is more symmetric than a nonisosceles triangle, and an
equilateral triangle more symmetric than a nonequilateral isosceles triangle can
be quantified as the fact that the number of motions of the plane taking the
triangle to itself is different for these three types of triangles. It consists (a) of
just the identity map for a nonisosceles triangle, (b) of the identity map and
the reflection in the axis of symmetry for an isosceles triangle, and (c) of 6
transformations for the equilateral triangle, the identity, the rotations through
120° and 240° about the centre O and the reflections in the three axes of
symmetry (Figure 15).


Fig. 15, panels (a), (b), (c)


We give a number of typical examples of different types of symmetry. 

The symmetry group of a pattern in the plane consists of all motions of the
plane that take it to itself. For example, the symmetry group of the pattern
depicted in Figure 16 consists of the following motions: translations in the vector
OA, translation in OB followed by reflection in the axis OB, and all composites
of these.


Fig. 16

By a symmetry of a molecule we understand a motion of space which takes 
every atom of the molecule to an atom of the same element, and preserves all 
valency bonds between atoms. For example, the phosphorus molecule consists 
of 4 atoms situated at the vertexes of a regular tetrahedron, and its symmetry 
group coincides with the symmetry group of the tetrahedron, which we will 
discuss in detail in the following section § 13. 

Crystals have a large degree of symmetry, so that the symmetry group of a
crystal is an important characteristic of the crystal. Here by a symmetry we
understand a motion of space which preserves the position of the atoms of the
crystal and all relations between them, taking each atom into an atom of the
same element. We describe as an example the symmetry group of the crystal of
cooking salt NaCl, depicted in Figure 17. This consists of adjacent cubes, with
alternate vertexes occupied by atoms of sodium Na (●) and chlorine Cl (○):


The set of all symmetries is given (in a coordinate system chosen with the origin
in one of the atoms, and axes along the sides of the cubes, which are considered
to be of length 1) by permutations of the coordinate axes, reflections in the
coordinate planes, and translations in vectors with integer coordinates. It can be
expressed by the formulas

$$\begin{aligned} x_1' &= \varepsilon x_{i_1} + k, \\ x_2' &= \eta x_{i_2} + l, \\ x_3' &= \zeta x_{i_3} + m, \end{aligned}$$

where $(i_1, i_2, i_3)$ is a permutation of (1, 2, 3), each of $\varepsilon, \eta, \zeta = \pm 1$, and $(k, l, m) \in \mathbb{Z}^3$.

Algebraic or analytic expressions may also have symmetries: a symmetry of a
polynomial $F(x_1, \dots, x_n)$ is a permutation of the unknowns $x_1, \dots, x_n$ which
preserves F. From this, we get for example the term symmetric function, one
preserved by all permutations. On the other hand, the function $\prod_{i<k} (x_i - x_k)$ is
preserved only by even permutations. In general if F is a function defined on a
set X, then a transformation of X which preserves F can be considered as a
symmetry of F. In the preceding example, X was the finite set $\{x_1, \dots, x_n\}$.
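
The behaviour of the product $\prod_{i<k}(x_i - x_k)$ under permutations can be observed by evaluating it at a point with distinct integer coordinates. The sketch below is an illustration only; the function names and the sample point (2, 3, 5, 7) are our own choices, and the parity of a permutation is computed by counting inversions.

```python
from itertools import permutations

def difference_product(values):
    """The product of (x_i - x_k) over all pairs i < k."""
    result = 1
    for i in range(len(values)):
        for k in range(i + 1, len(values)):
            result *= values[i] - values[k]
    return result

def sign(perm):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for k in range(i + 1, len(perm)) if perm[i] > perm[k])
    return -1 if inversions % 2 else 1

point = (2, 3, 5, 7)
base = difference_product(point)

# Permutations of the unknowns that leave the value of the product unchanged.
preserving = [perm for perm in permutations(range(4))
              if difference_product(tuple(point[i] for i in perm)) == base]

print(len(preserving), all(sign(perm) == 1 for perm in preserving))
print(sum(point))   # a symmetric function: the same for every ordering of the point
```

Of the 24 permutations of four unknowns, the list `preserving` contains exactly the even ones; every odd permutation changes the sign of the product.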

A function on 3-space of the form $f(x^2 + y^2 + z^2)$ has all orthogonal
transformations as symmetries. Physical phenomena often reflect symmetries of this
type. For example, by E. Noether’s theorem, if a dynamical system on a manifold
X is described by a Lagrangian $\mathcal{L}$ which has a 1-parameter group $\{g_t\}$ of
transformations of X as symmetries, then the system has a first integral that
can easily be written down. Thus in the case of the motion of a system of point
bodies, invariance of $\mathcal{L}$ under translations leads to the conservation of momentum
and invariance under rotations to the conservation of angular momentum.

For any of the types of algebraic objects considered up to now, fields, rings,
algebras and modules, the symmetries are transformations of the corresponding 
sets preserving the basic operations. In this case they are called automorphisms. 

Thus the automorphisms of an n-dimensional vector space F over a field K
are nondegenerate linear transformations; the group of these is denoted by GL(F)
or GL(n, K). When a basis is chosen, they are given by nondegenerate $n \times n$
matrixes. Similarly, the automorphisms of the free module $A^n$ over a commutative
ring A form a group GL(n, A) and are given by matrixes with entries in A
whose determinant is invertible in A. The group consisting of matrixes of
determinant 1 is denoted by SL(n, A).

An automorphism $\sigma$ of a ring K (in particular, of a field) is a transformation
$\sigma$ such that

$$\sigma(a + b) = \sigma(a) + \sigma(b) \quad \text{and} \quad \sigma(ab) = \sigma(a)\sigma(b). \tag{2}$$

For example, if $R = M_n(K)$ then a nondegenerate matrix $c \in GL(n, K)$ defines an
automorphism $\sigma(a) = cac^{-1}$ of R.

The transformation $\sigma(z) = \bar{z}$ is an automorphism of the complex number field
C as an R-algebra, that is, an automorphism of the field extension C/R. In a
similar way, any field extension L/K of degree 2 (with the characteristic of K $\neq 2$)
has exactly 2 automorphisms. For it is easy to see that $L = K(\gamma)$ with $\gamma^2 = c \in K$.
Each automorphism is uniquely determined by its action on $\gamma$, since $\sigma(a + b\gamma) =
a + b\sigma(\gamma)$. But $\sigma$ preserves the field operations in L and fixes the elements of K,
and therefore $(\sigma(\gamma))^2 = \sigma(\gamma^2) = \sigma(c) = c$. Hence $\sigma(\gamma) = \pm\gamma$. Thus the only
automorphisms are the identity $\sigma(a + b\gamma) = a + b\gamma$ and the automorphism with
$\sigma(\gamma) = -\gamma$, that is $\sigma(a + b\gamma) = a - b\gamma$. Extensions having ‘less symmetry’ also
exist. For example, the extension $K = \mathbb{Q}(\gamma)$ where $\gamma^3 = 2$, is of degree 3 over Q,
and has only the identity automorphism. For as above, any automorphism $\sigma$ is
determined by its action on $\gamma$, and $(\sigma(\gamma))^3 = 2$. If $\sigma(\gamma) = \gamma_1 \neq \gamma$ then $(\gamma_1/\gamma)^3 = 1$,
so that $\varepsilon = \gamma_1/\gamma$ satisfies $\varepsilon^3 - 1 = 0$; since $\varepsilon \neq 1$, it satisfies $\varepsilon^2 + \varepsilon + 1 = 0$, and
hence $(2\varepsilon + 1)^2 = -3$. Hence K must contain the field $\mathbb{Q}(\sqrt{-3})$. But K/Q and
$\mathbb{Q}(\sqrt{-3})/\mathbb{Q}$ have degree 3 and 2 respectively, and this contradicts the fact that
the degree of an extension is divisible by that of any extension contained in it
(§ 6, Formula (3)).

Finally the symmetries of physical laws (by this, we understand coordinate
transformations which preserve the law) are very important characteristics of
these laws. Thus the laws of mechanics should be preserved on passing from one
inertial coordinate system to another. The corresponding coordinate transformations
(for motion on a line) are of the form

$$x' = x - vt, \quad t' = t \tag{3}$$

in the mechanics of Galileo and Newton, and

$$x' = \frac{x - vt}{\sqrt{1 - (v/c)^2}}, \quad t' = \frac{t - (v/c^2)x}{\sqrt{1 - (v/c)^2}} \tag{4}$$

in the mechanics of special relativity, where c is the speed of light in a vacuum.
The symmetry group given by the transformation formulas (3) is called the
Galileo-Newton group, and that given by formulas (4) the Lorentz group.
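
That formulas (3) and (4) each define a group can be seen concretely by composing two transformations. The following sketch is an illustration in floating-point arithmetic; the function names, the units with c = 1 and the sample numbers are our own assumptions. It uses the standard relativistic rule for combining velocities, $w = (u + v)/(1 + uv/c^2)$, which is not stated in the text above.

```python
import math

C_LIGHT = 1.0   # units in which the speed of light equals 1 (illustrative choice)

def galileo(x, t, v):
    """Formula (3): x' = x - v t, t' = t."""
    return x - v * t, t

def lorentz(x, t, v, c=C_LIGHT):
    """Formula (4) for a velocity v with |v| < c."""
    factor = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return factor * (x - v * t), factor * (t - v * x / c ** 2)

x, t = 2.0, 5.0
u, v = 0.3, 0.5

# Two Galileo transformations compose to the one with velocity u + v.
galileo_twice = galileo(*galileo(x, t, u), v)
galileo_once = galileo(x, t, u + v)

# Two Lorentz transformations compose to the one with velocity w.
w = (u + v) / (1 + u * v / C_LIGHT ** 2)
lorentz_twice = lorentz(*lorentz(x, t, u), v)
lorentz_once = lorentz(x, t, w)

# The quantity x^2 - c^2 t^2 is unchanged by a Lorentz transformation.
x1, t1 = lorentz(x, t, v)
print(galileo_twice, galileo_once)
print(lorentz_twice, lorentz_once)
print(x * x - (C_LIGHT * t) ** 2, x1 * x1 - (C_LIGHT * t1) ** 2)
```

Each printed pair agrees up to rounding error, so in both cases the composite of two transformations of the family again belongs to the family.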

An example of another type of symmetry of physical laws is the so-called parity
law, according to which all physical laws should preserve their form if we
simultaneously invert the sign of time, the sign of electrical charges, and the
orientation of space. Here we have a group consisting of just two symmetries.

We mention some very simple notions related to transformation groups. The
orbit of an element $x \in X$ under a transformation group G of X is the set Gx of
elements of the form g(x), as g runs through G. The stabiliser subgroup of x is the
set $G_x$ of elements of G which fix x, that is, $G_x = \{g \mid g(x) = x\}$. The stabiliser
subgroup of an element is itself a transformation group contained in G.

Consider the relation between two elements $x, y \in X$ that there should exist a
transformation $g \in G$ such that $g(x) = y$; this is an equivalence relation, that is,
it is reflexive, symmetric and transitive. To check this is just a rephrasing of
the three properties in the definition of a transformation group. All elements
equivalent to one another form an orbit of G. Hence X breaks up as the disjoint
union of orbits; this set of orbits or orbit space is denoted by $G \backslash X$. If there is just
one orbit, that is, if for any $x, y \in X$ there exists $g \in G$ with $y = g(x)$ then we say
that G is transitive.
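
For a finite set X these notions can be computed directly. In the sketch below, which is only an illustration, a transformation of X = {0, ..., 5} is stored as a tuple g with g[x] the image of x; the group is obtained by closing a single sample permutation under composition, and all names are our own.

```python
def compose(f, g):
    """The composite f∘g, that is x -> f(g(x)), as in Formula (1)."""
    return tuple(f[g[x]] for x in range(len(g)))

def generate(generators):
    """Close a list of permutations of {0, ..., n-1} under composition.
    On a finite set the result contains the identity and all inverses."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        h = frontier.pop()
        for g in generators:
            new = compose(g, h)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group

def orbits(group, n):
    """The set of orbits of the group on {0, ..., n-1}."""
    return {frozenset(g[x] for g in group) for x in range(n)}

def stabiliser(group, x):
    """The elements of the group that fix x."""
    return {g for g in group if g[x] == x}

a = (1, 2, 0, 4, 3, 5)      # sample permutation: 0 -> 1 -> 2 -> 0, 3 <-> 4, 5 fixed
G = generate([a])

print(len(G))
print(sorted(sorted(orbit) for orbit in orbits(G, 6)))
print(len(stabiliser(G, 0)), len(stabiliser(G, 3)), len(stabiliser(G, 5)))
```

Here X breaks up into three orbits, so this group is not transitive; the stabiliser of each point is closed under composition, as the text asserts.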

We now proceed to the notion of a group. This formalises only certain aspects 
of transformation groups: the fact that we can multiply transformations (Formula 
(1)), and the associativity law (fg)h = f(gh) for this multiplication (which can be 
verified immediately from the definition). 

A group is a set G with an operation defined on it (called multiplication), which
sends any two elements $g_1, g_2 \in G$ into an element $g_1 g_2 \in G$, and which satisfies:

associativity: $(g_1 g_2) g_3 = g_1 (g_2 g_3)$;

existence of an identity: there exists an element $e \in G$ such that $eg = ge = g$ for
all $g \in G$ (it is easy to prove that e is unique);

existence of an inverse: for any $g \in G$ there exists an element $g^{-1} \in G$ such that
$g g^{-1} = g^{-1} g = e$ (it is easy to prove that $g^{-1}$ is unique).


From the associativity law, one proves easily that for any number of elements
$g_1, g_2, \dots, g_n$, their product (in the given order) does not depend on the position
of brackets, and can therefore be written $g_1 g_2 \cdots g_n$. The product $g \cdots g$ of g with
itself n times is written as $g^n$, and $(g^{-1})^n$ as $g^{-n}$.

If multiplication is commutative, that is, $g_2 g_1 = g_1 g_2$ for any $g_1, g_2 \in G$, we
say that G is an Abelian (or commutative) group. In essence, we have already seen
this notion, since it is equivalent to that of module over the ring of integers Z.
To emphasise this relation, the group operation in an Abelian group is usually
called the sum, and written $g_1 + g_2$. We then write 0 instead of e, $-g$ instead of
$g^{-1}$, and $ng$ instead of $g^n$. For $n \in \mathbb{Z}$ and $g \in G$, $ng$ is a product, defining a
Z-module structure on G.

If a group G has a finite number of elements, we say that G is finite. The number
of elements is called the order of G, and is denoted by |G|.

An isomorphism of two groups $G_1$ and $G_2$ is a 1-to-1 correspondence $f\colon G_1 \to G_2$
such that $f(g_1 g_2) = f(g_1) f(g_2)$. We say that two groups are isomorphic if there
exists an isomorphism between them, and we then write $G_1 \cong G_2$.

A finite group $G = \{g_1, \dots, g_n\}$ can be specified by its ‘multiplication table’, the
so-called Cayley table. This is a square matrix with the product $g_i g_j$ in the ith
row and jth column. For example, if G is a group of order 2 then it consists of
the identity element e and $g \neq e$, and it is easy to see that $g^2$ can only be e. Hence
its Cayley table is of the form

      | e  g
    --+------
    e | e  g
    g | g  e

If G is of order 3 with identity element e and $g \neq e$ then it is easy to see that
$g^2 \neq e$ and $g^2 \neq g$, so that $G = \{e, g, h\}$ with $h = g^2$. Just as easily, we see that
$gh = e$. Hence the Cayley table of G is of the form

      | e  g  h
    --+---------
    e | e  g  h
    g | g  h  e
    h | h  e  g

An isomorphism of two groups means that (up to notation for the elements),
their Cayley tables are the same. Of course, Cayley tables can only conveniently
be written out for groups of fairly small order. There are other ways of specifying
the operation in a group. For example, let $G$ be the group of symmetries of an
equilateral triangle (Figure 15, (c)). Let $s$ denote the reflection in one of the
heights, and $t$ a rotation through 120° about the centre. Then $t^2$ is a rotation
through 240°. It is easy to see that $s$, $st$ and $st^2$ are reflections in the three heights.
Thus all the elements of $G$ can be written in the form


        $e,\ t,\ t^2,\ s,\ st,\ st^2.$    (5)
Obviously,
        $s^2 = e, \quad t^3 = e.$    (6)
Furthermore, it is easy to check that
        $ts = st^2.$    (7)


These rules already determine the multiplication of the group. For example


        $(st)^2 = stst = s\,st^2\,t = s^2 t^3 = e,$
and

        $t^2 s = t\,ts = t\,st^2 = st^2 t^2 = st^4 = st.$
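
The two computations above can be mechanised. In the sketch below (an illustration under our own conventions, not a construction from the text) an element $s^a t^b$ of the list (5) is stored as the pair `(a, b)`, and the function `multiply` uses only the relations (6) and (7): moving $t^b$ past $s$ turns it into $t^{2b}$.

```python
def multiply(x, y):
    """Multiply s^a t^b and s^c t^d, given as pairs (a, b) and (c, d).

    Exponents of s are taken mod 2 and exponents of t mod 3, by relations (6).
    """
    a, b = x
    c, d = y
    # Relation (7), ts = s t^2, applied b times gives t^b s = s t^(2b).
    shifted = (2 * b) % 3 if c == 1 else b
    return ((a + c) % 2, (shifted + d) % 3)

e, t, s = (0, 0), (0, 1), (1, 0)
st = multiply(s, t)
t2 = multiply(t, t)

# The six normal forms e, t, t^2, s, st, st^2 of (5).
elements = [(a, b) for a in range(2) for b in range(3)]

# The rule really defines an associative multiplication on these six symbols.
associative = all(
    multiply(multiply(x, y), z) == multiply(x, multiply(y, z))
    for x in elements for y in elements for z in elements
)
```

With these definitions `multiply(st, st)` returns `e` and `multiply(t2, s)` returns `st`, in agreement with the two worked examples.
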
This method of describing a group is called specifying it by generators and
relations; it will be described more precisely in § 14. In this notation, an
isomorphism of $G$ with a group $G'$ means that $G'$ also has elements $s'$ and $t'$, in terms
of which all the other elements of $G'$ can be expressed as in (5), and which satisfy
conditions (6) and (7). Consider for example the group $GL(2, \mathbb{F}_2)$ of nondegenerate
$2 \times 2$ matrixes with entries in the field $\mathbb{F}_2$. It is easy to write them all out,
since their columns are necessarily of the form


        $\begin{bmatrix} 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 1 \end{bmatrix},$    (8)


and nonproportional columns (to give a nondegenerate matrix) means in the
present case that they are distinct. We get 6 matrixes


        $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},\ \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix},\ \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix},\ \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}.$


One sees easily that this list is precisely of the form (5) with
$t' = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}$ and $s' = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$,
and that the relations (6) and (7) hold. Therefore this group is
isomorphic to the symmetry group of an equilateral triangle. The isomorphism
we have found seems quite mysterious. However, it can be described in a more
meaningful way: for this we need to observe that symmetries of a triangle permute
the 3 vertexes, and that in our case all of the 6 possible permutations are realised.
The nondegenerate matrixes over $\mathbb{F}_2$ act on the 3 column vectors (8), and also
realise all possible permutations of these. Thus each of the two groups is
isomorphic to the group of all permutations of 3 elements.
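
The claim that the nondegenerate matrixes over the field with two elements realise all permutations of the three column vectors (8) is small enough to check exhaustively. The sketch below is an illustration only; the helper names `apply`, `matmul`, `as_permutation` and `compose` are our own.

```python
from itertools import product, permutations

# The three nonzero column vectors over F_2, as in (8).
vectors = [(1, 0), (0, 1), (1, 1)]

def apply(m, v):
    """Apply the 2x2 matrix m = ((a, b), (c, d)) to the column vector v over F_2."""
    (a, b), (c, d) = m
    x, y = v
    return ((a * x + b * y) % 2, (c * x + d * y) % 2)

def matmul(m, n):
    """Product of two 2x2 matrices with entries reduced mod 2."""
    (a, b), (c, d) = m
    (p, q), (r, s) = n
    return (((a * p + b * r) % 2, (a * q + b * s) % 2),
            ((c * p + d * r) % 2, (c * q + d * s) % 2))

# Nondegenerate matrices: the determinant ad - bc is nonzero mod 2.
gl2 = [((a, b), (c, d))
       for a, b, c, d in product((0, 1), repeat=4)
       if (a * d - b * c) % 2 == 1]

def as_permutation(m):
    """The permutation of the indices 0, 1, 2 that m induces on `vectors`."""
    return tuple(vectors.index(apply(m, v)) for v in vectors)

def compose(p, q):
    """The permutation 'first q, then p'."""
    return tuple(p[q[i]] for i in range(3))

perms = {m: as_permutation(m) for m in gl2}

# Matrix multiplication corresponds to composition of permutations.
is_homomorphism = all(
    perms[matmul(m, n)] == compose(perms[m], perms[n])
    for m in gl2 for n in gl2
)
```

Here `gl2` has 6 elements, and the values of `perms` run through all 6 permutations of three indices, so the map from matrixes to permutations is a 1-to-1 correspondence compatible with multiplication.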

From this and from many other examples, we see that isomorphisms may 
occur between groups whose elements are completely different in nature, and 
which arise in connection with different problems. The notion of isomorphism 
focuses attention just on the multiplication law in the groups, independently of 
the concrete character of their elements. We can imagine that there exists a 
certain ‘abstract’ group, the elements of which are just denoted by some symbols, 
for which a composition law is given (like a Cayley table), and that concrete 
groups are 'realisations' of it. Quite surprisingly, it often turns out that properties
of the abstract group, that is, properties purely of the group operation, have
a lot to offer in understanding the concrete realisation, which the abstract group 
‘coordinatises’. A classic example is Galois theory, which will be discussed later. 
However, in the majority of applications of group theory, properties of the 
abstract group are mixed up together with properties of the concrete realisation. 

In the above, we have only given examples of transformation groups, and so 
as not to give the reader the impression that this is the only way in which groups 
arise naturally, we give here a number of examples of a different nature. The
homotopy groups $\pi_n(X)$ and the homology and cohomology groups $H_n(X)$ and
$H^n(X)$ of a topological space $X$ are examples of this kind, but we postpone a
discussion of these until §§ 20-21.


Example 1. The Ideal Class Group. The product of ideals in an arbitrary
commutative ring $A$ was defined in § 4. This operation is associative and has an
identity (the ring $A$) but the inverse of an ideal almost never exists (even in $\mathbb{Z}$).
We identify together in one class all nonzero ideals which are isomorphic as
$A$-modules (see § 5). The product operation can be transferred to classes, and for
many rings $A$, these already form a group; this means simply that for each ideal
$I \ne (0)$ there exists an ideal $J$ such that $IJ$ is a principal ideal. In this case
we speak of the ideal class group of $A$; it is denoted by $\operatorname{Cl} A$. This happens
for example in the case of the ring of algebraic integers of a finite extension $K/\mathbb{Q}$
(see § 7). Here the group $\operatorname{Cl} A$ is finite. For the ring $A$ of numbers of the form
$a + b\sqrt{-5}$ with $a, b \in \mathbb{Z}$ (§ 3, Example 12), $\operatorname{Cl} A$ is of order 2. Thus all
nonprincipal ideals are equivalent (that is, isomorphic as $A$-modules), and the
product of any two of them is a principal ideal.


Example 2. The Group of Extensions $\operatorname{Ext}_R(A, B)$. Suppose that $A$ and $B$ are
modules over a ring $R$. We say that a module $C$ is an extension of $A$ by $B$ if $C$
contains a submodule $B_1$ isomorphic to $B$, and $C/B_1$ is isomorphic to $A$; the
isomorphisms $u: B \cong B_1$ and $v: C/B_1 \cong A$ are included in the notion of extension.
The trivial extension is $C = A \oplus B$. The group $\mathbb{Z}/p^2\mathbb{Z}$ is a nontrivial extension of
$\mathbb{Z}/p\mathbb{Z}$ by $\mathbb{Z}/p\mathbb{Z}$. In exactly the same way, a 2-dimensional vector space together
with a linear transformation given in some basis by the matrix
$\begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix}$ (a $2 \times 2$
Jordan block with eigenvalue $\lambda$) defines a module $C$ over the ring $K[x]$ which is
a nontrivial extension of the module $A$ by itself, where $A$ corresponds to the $1 \times 1$
matrix $\lambda$.

We say that two extensions $C$ and $C'$ of $A$ by $B$ are equivalent if there exists
an isomorphism $\varphi: C \cong C'$ taking $B_1 \subset C$ into $B_1' \subset C'$ and compatible with the
isomorphisms $B_1 \cong B \cong B_1'$ and $C/B_1 \cong A \cong C'/B_1'$. The set of all extensions of
$A$ by $B$, considered up to equivalence, is denoted by $\operatorname{Ext}_R(A, B)$. For semisimple
rings $R$ it consists of the trivial extension only; in the general case it measures
the failure of this typically semisimple situation.

We can make $\operatorname{Ext}_R(A, B)$ into a group. Let $C, C' \in \operatorname{Ext}_R(A, B)$, with
$f: B \cong B_1 \subset C$ and $g: C/B_1 \cong A$ the isomorphisms in the definition of $C$, and $f'$, $g'$ the
same for $C'$. Consider the submodules $D \subset C \oplus C'$ consisting of pairs $(c, c')$ for
which $g(c) = g'(c')$, and $E \subset C \oplus C'$ of pairs $(f(b), -f'(b))$ for $b \in B$; then set
$C'' = D/E$. The homomorphism $f''$ given by $f''(b) = (f(b), 0) + E = (0, f'(b)) + E$
defines an isomorphism of $B$ with a submodule $B_1'' \subset C''$; and $g''$ given by
$g''(c, c') = g(c) = g'(c')$ for $(c, c') \in D$ defines a homomorphism of $D$ onto $A$ which
is equal to 0 on $E$, that is, a homomorphism $g'': C'' \to A$, which one easily checks
defines an isomorphism of $C''/B_1''$ with $A$. Thus $C''$ is an extension of $A$ by $B$. The
corresponding element $C'' \in \operatorname{Ext}_R(A, B)$ is called the sum of the extensions $C$ and
$C'$. It is not hard to check all the group axioms; the zero element is the trivial
extension.


Example 3. The Brauer Group. We define a group law on the set of central
division algebras of finite rank over a given field $K$ (see § 11). Let $R_1$ and $R_2$ be
two simple central algebras of finite rank over $K$. We define a multiplication in
the module $R_1 \otimes_K R_2$ by

        $(a_1 \otimes a_2)(b_1 \otimes b_2) = a_1 b_1 \otimes a_2 b_2$

on generators; then $R_1 \otimes_K R_2$ becomes an algebra over $K$, and it is not hard to
prove that it is also simple and central. For example $M_{n_1}(K) \otimes M_{n_2}(K) \cong M_{n_1 n_2}(K)$.
If $D_1$ and $D_2$ are two central division algebras of finite rank over $K$, then from
what we have said, by Wedderburn's theorem (§ 10, Theorem IV), $D_1 \otimes_K D_2 \cong
M_n(D)$, where $D$ is some central division algebra; we say that $D$ is the product of $D_1$ and
$D_2$. It can be shown that this defines a group; the inverse element of a division
algebra $D$ is the opposite algebra $D'$. This group is called the Brauer group of $K$,
and is denoted by $\operatorname{Br} K$. It can be shown that the description of division algebras
over $\mathbb{Q}_p$ and $\mathbb{Q}$ given in § 11, Theorems V and VIII also gives the group law of
the Brauer group of these fields: we need only view the roots of unity appearing
there as elements of the group of roots of unity. For example, $\operatorname{Br} \mathbb{Q}_p$ is isomorphic
to the group of all roots of unity, and $\operatorname{Br} \mathbb{Q}$ to the group of collections $(\mu_{\mathbb{R}}, \mu_p, \dots)$
of roots of unity satisfying the conditions of § 11, Theorem VIII. The group $\operatorname{Br} \mathbb{R}$
has order 2.

The following sections §§ 13-15 of this book are devoted to a review of
examples of the most commonly occurring types of groups and the most useful
concrete groups. This gives a (necessarily very rough) survey of the different ways 
in which the general notion of group is related to the rest of mathematics (and 
not only to mathematics). But first of all, we must treat some of the simplest 
notions and properties which belong to the definition of group itself, and without 
which it would be awkward to consider the examples. 

A subgroup of a group $G$ is a subset which contains together with any element
its inverse, and together with any two elements their product. An example of a
subgroup is the stabiliser subgroup of any point in a transformation group. Let
$\{g_\alpha\}$ be an arbitrary set of elements of $G$; the set of all products of the $g_\alpha$ and
their inverses (taken in any order) is easily seen to form a subgroup of $G$. This is
called the subgroup generated by $\{g_\alpha\}$. If this group is the whole of $G$ then we
say that the $g_\alpha$ are generators of $G$. For example, 1 is a generator of the group $\mathbb{Z}$
(written additively). In the symmetry group of the equilateral triangle, $s$ and $t$ are
generators (see (5)).

A homomorphism of a group $G$ to $G'$ is a map $f: G \to G'$ such that


        $f(g_1 g_2) = f(g_1) f(g_2).$


A homomorphism $f$ of $G$ to the transformation group of a set $X$ is called an
action of $G$ on $X$. To specify an action, we need to define for every element $g \in G$
the corresponding transformation $f(g)$ of $X$, that is, $f(g)(x)$ for any $x \in X$. Writing
$f(g)(x)$ in the short form $g \cdot x$, we see that giving an action of $G$ on $X$ is the same
thing as assigning to any $g \in G$ and $x \in X$ an element $g \cdot x \in X$ (that is, a map
$G \times X \to X$ given by $(g, x) \mapsto g \cdot x$) satisfying the condition:


        $(g_1 g_2) \cdot x = g_1 \cdot (g_2 \cdot x)$

(which is equivalent to $f$ being a homomorphism from $G$ to the group of all
transformations of $X$).

We say that two actions of the same group $G$ on two different sets $X$ and $X'$
are isomorphic if we can establish a 1-to-1 correspondence $x \leftrightarrow x'$ between
elements of $X$ and $X'$ such that $x \leftrightarrow x'$ implies $g \cdot x \leftrightarrow g \cdot x'$ for any $g \in G$.


Example 4. Consider the group $G$ of real $2 \times 2$ matrixes with determinant 1.
Such a matrix $g = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$ acts on the upper half-plane
$\mathbb{C}^+ = \{z \mid \operatorname{Im} z > 0\}$ of one complex variable by the rule
$z \mapsto \dfrac{\alpha z + \beta}{\gamma z + \delta}$. We obviously get an action of $G$
on $\mathbb{C}^+$. On the other hand, $G$ acts on the set $S$ of $2 \times 2$ positive definite symmetric
matrixes, with $g$ taking $s$ into $g s g^*$, where $g^*$ is the transpose matrix. If we write
$s$ as $\begin{bmatrix} a & b \\ b & c \end{bmatrix}$, we can characterise $S$ by the conditions $a > 0$, $ac - b^2 > 0$, which
in the projective plane with homogeneous coordinates $(a : b : c)$ define the interior
$K$ of the conic with equation $ac = b^2$. Obviously $G$ also acts on $K$ by projective
transformations taking this conic to itself. A positive definite quadratic form
$ax^2 + 2bxy + cy^2$ can be written in the form $a(x - zy)(x - \bar{z}y)$ with $z \in \mathbb{C}^+$. It is
easy to see that taking the matrix $\begin{bmatrix} a & b \\ b & c \end{bmatrix}$ to $z$, we get a 1-to-1 correspondence
between the sets $S$ and $\mathbb{C}^+$, defining an isomorphism of the two actions of $G$. As
is well known, $\mathbb{C}^+$ and $S$ define two interpretations of Lobachevsky plane
geometry, the Poincaré and Cayley-Klein models. In either of these
interpretations, $G$ defines the group of proper (orientation-preserving) motions of the
Lobachevsky plane.
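
The condition for an action can be tested numerically for the fractional-linear rule of this example. The sketch below is only an illustration: the two matrices `g1`, `g2` of determinant 1 and the point `z` are arbitrary sample values chosen here, and `act` and `matmul` are our own helper names.

```python
def act(g, z):
    """Action of g = ((a, b), (c, d)) on z by the rule z -> (a z + b) / (c z + d)."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def matmul(g, h):
    """Product of two real 2x2 matrices."""
    (a, b), (c, d) = g
    (p, q), (r, s) = h
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))

g1 = ((2.0, 1.0), (1.0, 1.0))   # determinant 2*1 - 1*1 = 1
g2 = ((1.0, 3.0), (0.0, 1.0))   # determinant 1
z = complex(0.5, 2.0)           # a point of the upper half-plane

# The action condition (g1 g2) . z = g1 . (g2 . z).
lhs = act(matmul(g1, g2), z)
rhs = act(g1, act(g2, z))

# For determinant 1 the imaginary part is Im z / |c z + d|^2, hence stays positive.
(_, _), (c, d) = matmul(g1, g2)
expected_imag = z.imag / abs(c * z + d) ** 2
```

Up to rounding, `lhs` and `rhs` agree, and `lhs.imag` equals `expected_imag`, which illustrates why the upper half-plane is taken to itself.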

There are three very important examples of an action of a group on itself:


        $g \cdot x = gx,$
        $g \cdot x = xg^{-1},$
        $g \cdot x = gxg^{-1}$


(the left-hand sides are the actions, the right-hand sides are in terms of
multiplication in $G$). These are called the left regular, the right regular and the adjoint action.
The left regular and right regular actions are isomorphic: an isomorphism is
given by the map $x \mapsto x^{-1}$ of $G$ to itself.

An action of a group $G$ on a set $X$ defines of course an action of any subgroup
$H \subset G$. In particular, the left regular action defines an action of any subgroup
$H \subset G$ on the whole of $G$. The orbits of the transformation group obtained in
this way are called the left cosets of $G$ under $H$. Thus a left coset consists of all
elements of the form $hg$, where $g \in G$ is some fixed element, and $h$ runs through
all possible elements of $H$; it is denoted by $Hg$. By what we have said above
concerning orbits, any element $g_1 \in Hg$ defines the same coset, and the whole
group decomposes as a disjoint union of cosets. Similarly, the right regular action
of a subgroup $H \subset G$ defines the right cosets, of the form $gH$. The orbits of the
adjoint action are called the conjugacy classes of elements, and elements
belonging to one orbit are conjugate; $g_1$ and $g_2$ are conjugate if $g_1 = g g_2 g^{-1}$ for
some $g \in G$. If $H \subset G$ is a subgroup then for fixed $g \in G$, all the elements $ghg^{-1}$
with $h \in H$ are easily seen to form a subgroup; this is called a conjugate subgroup
of $H$, and is denoted by $gHg^{-1}$. For example, if $G$ is a transformation group of
a set $X$ and $g \in G$, $x \in X$ and $y = gx$, then the stabilisers of $x$ and $y$ are conjugate:
$G_y = g G_x g^{-1}$. The number of left cosets of a subgroup $H \subset G$ (finite or otherwise)
is called the index of $H$, and denoted by $(G : H)$. If $G$ is a finite group, then the
number of elements in each coset of $H$ is equal to the order $|H|$, and hence


        $|G| = |H| \cdot (G : H).$    (9)


In particular, $|H|$ divides $|G|$.
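
Formula (9) can be watched at work in the group of all permutations of 3 elements. The following sketch lists the cosets $Hg$ of a subgroup of order 2; permutations are stored as tuples, and the names `compose`, `G`, `H` and `cosets` are illustrative choices of ours.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of permutations given as tuples: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))      # the group of all permutations of 3 elements
H = [(0, 1, 2), (1, 0, 2)]            # subgroup of order 2 generated by one transposition

# Cosets Hg = {hg : h in H}; frozensets make equal cosets compare equal.
cosets = {frozenset(compose(h, g) for h in H) for g in G}

index = len(cosets)                   # the index (G : H)
```

Here the six elements fall into `index == 3` cosets of two elements each, so that $|G| = 6 = 2 \cdot 3$.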

Suppose that the action of a group $G$ on a set $X$ is transitive. Then for any $x$,
$y \in X$ there exists $g \in G$ such that $g(x) = y$, and all such elements form a right
coset $gG_x$ under the stabiliser of $x$. We get in this way a 1-to-1 correspondence
between elements of $X$ and cosets of $G_x$ in $G$. If $X$ is finite and we write $|X|$ for
the number of elements of $X$, then we see that


        $|X| = (G : G_x).$


If the action of $G$ is not transitive, let $X_\alpha$ be its orbits; then $X = \bigcup X_\alpha$. Choosing
a representative $x_\alpha$ in each orbit $X_\alpha$ we get a 1-to-1 correspondence between the
elements of the orbit and cosets of $G_\alpha$ in $G$. In particular, if $X$ is finite, then


        $|X| = \sum (G : G_\alpha),$    (10)


where $G_\alpha$ are the stabilisers of the representatives chosen in each orbit.
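
Formula (10) is easy to confirm on a small non-transitive action. In the sketch below (sample data of our own choosing) the cyclic group generated by the permutation $(0\,1\,2)(3\,4)$ acts on the set $X = \{0, 1, 2, 3, 4\}$, which splits into two orbits.

```python
def compose(p, q):
    """Product pq of permutations given as tuples: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

sigma = (1, 2, 0, 4, 3)               # the permutation (0 1 2)(3 4)
identity = tuple(range(5))

# The cyclic group generated by sigma: e, sigma, sigma^2, ...
G = [identity]
while compose(sigma, G[-1]) != identity:
    G.append(compose(sigma, G[-1]))

X = range(5)
orbits = {frozenset(g[x] for g in G) for x in X}

def stabiliser(x):
    """All elements of G leaving x fixed."""
    return [g for g in G if g[x] == x]

# One representative per orbit; for a finite group (G : G_x) = |G| / |G_x|.
indices = [len(G) // len(stabiliser(min(orbit))) for orbit in orbits]
```

The group has order 6, the stabilisers of the representatives have orders 2 and 3, and the indices 3 and 2 add up to $|X| = 5$; each index is also the size of the corresponding orbit.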

The image $\operatorname{Im} f$ of a homomorphism $f: G \to G'$ is the set of all elements of the
form $f(g)$ with $g \in G$; it is a subgroup of $G'$. The kernel $\operatorname{Ker} f$ of $f$ is the set of
elements $g \in G$ such that $f(g) = e$. The kernel is of course a subgroup, but satisfies
an additional condition:


        $g^{-1}hg \in \operatorname{Ker} f$  if  $h \in \operatorname{Ker} f$ and $g \in G$.    (11)


The verification of this is trivial. A subgroup $N \subset G$ satisfying condition (11) is
called a normal subgroup.³ In other words, a normal subgroup must be invariant
under the adjoint action, $g^{-1}Ng = N$. We write $N \lhd G$ to denote the fact that $N$
is a normal subgroup of $G$. The definition of a normal subgroup is equivalent to
the fact that any left coset of $N$ is also a right coset, $gN = Ng$. Thus, although
the left and right regular actions of a normal subgroup on the group will not in
general be the same, they have the same orbits.

The decomposition of $G$ into cosets of a normal subgroup $N$ is compatible
with multiplication: if we replace $g_1$ or $g_2$ by an element in the same coset then
$g_1 g_2$ also remains in the same coset. This is a tautological reformulation of the
definition of a normal subgroup. It follows from this that multiplication can be
carried over to cosets, which then form a group, called the quotient group by $N$.
This is denoted by $G/N$, and taking an element $g \in G$ into the coset $gN$ defines
a homomorphism $G \to G/N$, called the canonical homomorphism.

³ A literal translation of the Russian term, derived from the German Normalteiler, would be normal
divisor, containing the idea of a factor in a product (compare § 16).

We have relations which are already familiar from the theory of rings and 
modules. 


Homomorphisms Theorem. The image of a homomorphism is isomorphic to the 
quotient group by its kernel. 

Any homomorphism $f$ reduces to a canonical one: there exists an isomorphism
of $G/\operatorname{Ker} f$ and $\operatorname{Im} f$ such that composing with the canonical homomorphism
$G \to G/\operatorname{Ker} f$ gives $f$.
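
A small additive example makes the theorem concrete. In the sketch below we take, purely as an illustration of our own, the homomorphism $x \mapsto 4x$ of the group $\mathbb{Z}/12\mathbb{Z}$ to itself, and check that $f$ is constant on each coset of its kernel and that distinct cosets have distinct images.

```python
n = 12
G = range(n)

def f(x):
    """The homomorphism x -> 4x of the additive group Z/12Z to itself."""
    return (4 * x) % n

kernel = [x for x in G if f(x) == 0]
image = sorted({f(x) for x in G})

# Cosets x + Ker f, and the set of values f takes on each coset.
cosets = {frozenset((x + k) % n for k in kernel) for x in G}
values_on_coset = {c: {f(x) for x in c} for c in cosets}

# f respects the group operation.
is_homomorphism = all(f((x + y) % n) == (f(x) + f(y)) % n for x in G for y in G)
```

Each coset is sent to a single element of the image, and the three cosets go to the three distinct elements 0, 4, 8; this is the isomorphism of the quotient group by the kernel with the image in this case.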


Example 5. Consider the group $G$ of all motions of the Euclidean plane $E$, that
is, of all transformations preserving the distance between points. This action can
be extended to the vector space $L$ of all free vectors of $E$: if $x, y \in E$ and
$\overrightarrow{xy}$ denotes the vector with origin $x$ and end point $y$ then $g(\overrightarrow{xy})$ is by definition
the vector $\overrightarrow{g(x)g(y)}$. It is easy to check that if $\overrightarrow{xy} = \overrightarrow{x_1 y_1}$ (that is, $xy$ and $x_1 y_1$ are
equal and parallel) then also $\overrightarrow{g(x)g(y)} = \overrightarrow{g(x_1)g(y_1)}$, so that the transformation $g$
of $L$ is defined without ambiguity. In $L$ the transformation $g$ is orthogonal. The map
taking a motion $g$ to the transformation it induces on $L$ is a homomorphism from
the group of motions to the group of orthogonal
transformations. The image of this homomorphism is the whole of the group of
orthogonal transformations. Its kernel consists of translations of $E$ by vectors
$u \in L$. Thus the translation group is a normal subgroup. This can also be verified
directly: if $T_u$ is a translation by a vector $u$, then one sees easily that $g T_u g^{-1} = T_{g(u)}$.
We see that the group of orthogonal transformations is isomorphic to the
quotient group of the group of motions by the normal subgroup of translations.
In the present case this is easy to see directly, by choosing some point $O \in E$.
Then any motion can be written in the form $g = T_u g'$, where $g' \in G_O$ (the
stabiliser of the point $O$). Obviously, $G_O$ is isomorphic to the group of orthogonal
transformations, and taking the coset of $T_u g'$ into $g'$ gives our homomorphism.

Let $g$ be an element of $G$. The map $\varphi_g: \mathbb{Z} \to G$ given by $\varphi_g(n) = g^n$ is obviously
a homomorphism; its image consists of all powers of $g$, is called the cyclic
subgroup generated by $g$, and is denoted by $\langle g \rangle$. If there exists an element $g$ such
that $G = \langle g \rangle$ (that is, all elements of $G$ are powers of $g$), then we say that $G$ is
cyclic, and $g$ is a generator. An example of a cyclic group is the group $\mathbb{Z}$ of integers
under addition; it has the generator 1. Every subgroup of $\mathbb{Z}$ either consists of 0
only, or is the subgroup $k\mathbb{Z}$ of all multiples in $\mathbb{Z}$ of the smallest positive number
$k$ contained in it, that is, it is also cyclic. Returning to the homomorphism
$\varphi_g: \mathbb{Z} \to G$, we can say that either $\operatorname{Ker} \varphi_g = 0$ (which means that all the powers
$g^n$ of $g$ are distinct), or $\operatorname{Ker} \varphi_g = k\mathbb{Z}$; this means that $g^k = e$, and $\langle g \rangle$ is isomorphic
to the cyclic group $\mathbb{Z}/k\mathbb{Z}$ of order $k$. In the first case we say that $g$ is an element
of infinite order, and in the second, of order $k$. If $G$ is a finite group then by (9),
$k$ divides $|G|$.

Of all the methods of constructing groups, we mention here the simplest.
Suppose that $G_1$ and $G_2$ are two groups. Consider the set of all pairs $(g_1, g_2)$ with
$g_1 \in G_1$ and $g_2 \in G_2$. The operation on pairs is defined element-by-element:
$(g_1, g_2)(g_1', g_2') = (g_1 g_1', g_2 g_2')$. It is easy to see that we get in this way a group. It
is called the direct product of $G_1$ and $G_2$ and denoted by $G_1 \times G_2$. If $e_1$ and
$e_2$ are the identity elements of $G_1$ and $G_2$ then the maps $g_1 \mapsto (g_1, e_2)$ and
$g_2 \mapsto (e_1, g_2)$ are isomorphisms of $G_1$ and $G_2$ with subgroups of $G_1 \times G_2$. Usually
we identify elements of $G_1$ and $G_2$ with their images under these isomorphisms,
that is, we write $g_1$ for $(g_1, e_2)$ and $g_2$ for $(e_1, g_2)$. Then $G_1$ and $G_2$ are subgroups
of $G_1 \times G_2$; in $G_1 \times G_2$ we have $g_1 g_2 = g_2 g_1$ if $g_1 \in G_1$ and $g_2 \in G_2$. If $G_1$ and $G_2$
are Abelian (and written additively) then we obtain the operation of direct sum
of $\mathbb{Z}$-modules which we know from § 5, Example 5.


§ 13. Examples of Groups: Finite Groups 


In the same way that the general notion of a group relates to transformation 
groups of an arbitrary set, finite groups relate to groups of transformations of a 
finite set; in this case, transformations are also called permutations. 


Example 1. The group of all transformations of a finite set $X$ consisting of $n$
elements $x_1, \dots, x_n$ is called the symmetric group on $x_1, \dots, x_n$; it is denoted by
$\mathfrak{S}_n$. The stabiliser $(\mathfrak{S}_n)_{x_1}$ of $x_1$ is obviously isomorphic to $\mathfrak{S}_{n-1}$. Each coset $g(\mathfrak{S}_n)_{x_1}$
consists of all permutations that take $x_1$ into a given element $x_i$. Therefore the
number of cosets is equal to $n$; hence $|\mathfrak{S}_n| = n \cdot |\mathfrak{S}_{n-1}|$, and by induction $|\mathfrak{S}_n| = n!$.


For $i = 1, \dots, n - 1$, write $\sigma_i$ for the transformation which transposes $x_i$ and
$x_{i+1}$ and leaves fixed the remaining elements. Obviously

        $\sigma_i^2 = e.$    (1)


Since any permutation of $x_1, \dots, x_n$ can be realised by successively transposing
neighbouring elements, any element $g \in \mathfrak{S}_n$ is a product of $\sigma_1, \dots, \sigma_{n-1}$, that is,
$\sigma_1, \dots, \sigma_{n-1}$ are generators of $\mathfrak{S}_n$. Obviously


        $\sigma_i \sigma_j = \sigma_j \sigma_i$  if  $|i - j| > 1,$    (2)


since then $\sigma_i$ and $\sigma_j$ transpose disjoint pairs of elements. The product $\sigma_i \sigma_{i+1}$
cyclically permutes the 3 elements $x_i$, $x_{i+1}$ and $x_{i+2}$, and therefore


        $(\sigma_i \sigma_{i+1})^3 = e$  for  $1 \le i \le n - 2.$    (3)


Fact. It can be shown that the multiplication in $\mathfrak{S}_n$ is entirely determined by the
relations (1), (2) and (3) between the generators $\sigma_1, \dots, \sigma_{n-1}$.


The precise meaning of this assertion will be made clear in § 14.
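
The relations (1), (2) and (3), and the fact that neighbouring transpositions generate the whole symmetric group, can be confirmed by direct computation for a small value of $n$. The sketch below takes $n = 4$ with 0-based indices, an arbitrary choice made here; it checks only that the relations hold and that the generators produce $n!$ elements, not the deeper assertion that the relations determine the multiplication.

```python
from math import factorial

n = 4
identity = tuple(range(n))

def compose(p, q):
    """Product pq of permutations given as tuples: first apply q, then p."""
    return tuple(p[q[i]] for i in range(n))

def power(p, k):
    """The k-th power of the permutation p."""
    result = identity
    for _ in range(k):
        result = compose(result, p)
    return result

def adjacent_transposition(i):
    """The permutation swapping i and i + 1 (0-based) and fixing everything else."""
    p = list(identity)
    p[i], p[i + 1] = p[i + 1], p[i]
    return tuple(p)

sigma = [adjacent_transposition(i) for i in range(n - 1)]

relation_1 = all(power(s, 2) == identity for s in sigma)
relation_2 = all(
    compose(sigma[i], sigma[j]) == compose(sigma[j], sigma[i])
    for i in range(n - 1) for j in range(n - 1) if abs(i - j) > 1
)
relation_3 = all(
    power(compose(sigma[i], sigma[i + 1]), 3) == identity
    for i in range(n - 2)
)

# Closure of the generators under multiplication.
generated = {identity}
frontier = [identity]
while frontier:
    g = frontier.pop()
    for s in sigma:
        h = compose(s, g)
        if h not in generated:
            generated.add(h)
            frontier.append(h)
```

All three relations hold, and `generated` contains $4! = 24$ permutations.
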

Let $\sigma \in \mathfrak{S}_n$ be an arbitrary permutation, and $H = \langle \sigma \rangle$ the cyclic subgroup
generated by $\sigma$. Under the action of $H$, the set $X$ breaks up into $k$ orbits $X_1, \dots,
X_k$; write $n_1, \dots, n_k$ for the number of elements in these. Obviously

        $n = n_1 + \cdots + n_k.$    (4)


The group $H = \langle \sigma \rangle$ cyclically permutes the elements within each orbit $X_i$.
Specifying the partition $X = \bigcup X_i$ and the cyclic permutation of the elements
within each $X_i$ (for example, writing $X_i = \{x_{\alpha_1}, x_{\alpha_2}, \dots, x_{\alpha_{n_i}}\}$ where $\sigma x_{\alpha_j} = x_{\alpha_{j+1}}$
for $j \le n_i - 1$ and $\sigma x_{\alpha_{n_i}} = x_{\alpha_1}$) uniquely determines the permutation $\sigma$. This data
is called the decomposition of $\sigma$ into cycles. The numbers $n_1, \dots, n_k$ are called
the cycle type of the permutation.

If $\sigma' = g\sigma g^{-1}$ is a conjugate element then its decomposition into cycles is of
the form $X = \bigcup gX_i$ where $gX_i = \{gx_{\alpha_1}, \dots, gx_{\alpha_{n_i}}\}$, and so the numbers $n_1, \dots,
n_k$ remain unchanged. Conversely, if $\sigma'$ is any permutation of the same cycle type
$n_1, \dots, n_k$ then it is easy to construct a permutation $g$ for which $\sigma' = g^{-1}\sigma g$. Thus
two permutations are conjugate if and only if they have the same cycle type. In
other words, the conjugacy classes of elements of $\mathfrak{S}_n$ are in 1-to-1 correspondence
with collections of natural numbers $n_1, \dots, n_k$ satisfying (4); in particular, the
number of conjugacy classes of elements of $\mathfrak{S}_n$ is equal to the number of partitions
of $n$ as a sum of positive integers.
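
For small $n$ the correspondence between conjugacy classes and cycle types can be checked by enumeration. The sketch below does this for $n = 5$ (our own choice of example); the functions `cycle_type`, `conjugacy_class` and `partitions` are helper names introduced here for illustration.

```python
from itertools import permutations

n = 5

def compose(p, q):
    """Product pq of permutations given as tuples: first apply q, then p."""
    return tuple(p[q[i]] for i in range(n))

def inverse(p):
    """The inverse permutation."""
    inv = [0] * n
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def cycle_type(p):
    """The orbit sizes n_1, ..., n_k of the cyclic group <p>, in decreasing order."""
    seen, lengths = set(), []
    for start in range(n):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = p[x]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

G = list(permutations(range(n)))

def conjugacy_class(p):
    """All conjugates g p g^{-1} with g in G."""
    return frozenset(compose(compose(g, p), inverse(g)) for g in G)

classes = {conjugacy_class(p) for p in G}

def partitions(m, largest=None):
    """Number of partitions of m into positive parts of size at most `largest`."""
    if largest is None:
        largest = m
    if m == 0:
        return 1
    return sum(partitions(m - k, k) for k in range(1, min(m, largest) + 1))
```

The 120 permutations of 5 elements fall into 7 conjugacy classes, each consisting of all permutations of one cycle type, and 7 is also the number of partitions of 5.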


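Example 2. Suppose now that the elements $x_1, \dots, x_n$ are independent variables
in the ring $\mathbb{Z}[x_1, \dots, x_n]$, and consider the polynomial

        $\Delta = \prod_{i<j} (x_i - x_j).$


It is obvious that under a permutation $\sigma$, the polynomial $\Delta$ is either unchanged,
or changes sign:


        $\Delta(\sigma(x_1), \dots, \sigma(x_n)) = \varepsilon(\sigma)\Delta(x_1, \dots, x_n)$,  with  $\varepsilon(\sigma) = \pm 1$;


and $\sigma \mapsto \varepsilon(\sigma)$ is a homomorphism of $\mathfrak{S}_n$ into the group $\{\pm 1\}$ of order 2. The
kernel of this homomorphism is called the alternating group, and is denoted by $\mathfrak{A}_n$; a
permutation is even if $\sigma \in \mathfrak{A}_n$, and odd if $\sigma \notin \mathfrak{A}_n$. Obviously the index of $\mathfrak{A}_n$ in $\mathfrak{S}_n$
equals $(\mathfrak{S}_n : \mathfrak{A}_n) = 2$, and therefore $|\mathfrak{A}_n| = n!/2$.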
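
The sign of a permutation can be computed exactly as in this example, by evaluating the product of differences before and after permuting the variables. The sketch below substitutes the distinct integers $1, \dots, n$ for the variables, an assumption made here so that the quotient is a well-defined integer; the names `delta` and `epsilon` are our own.

```python
from itertools import permutations
from math import factorial, prod

n = 4

def delta(xs):
    """The product of x_i - x_j over all i < j, evaluated at the numbers xs."""
    return prod(xs[i] - xs[j] for i in range(n) for j in range(i + 1, n))

def compose(p, q):
    """Product pq of permutations given as tuples: first apply q, then p."""
    return tuple(p[q[i]] for i in range(n))

point = tuple(range(1, n + 1))        # distinct values substituted for x_1, ..., x_n

def epsilon(p):
    """The factor +1 or -1 by which the product changes when the variables are permuted by p."""
    permuted = tuple(point[p[i]] for i in range(n))
    return delta(permuted) // delta(point)

G = list(permutations(range(n)))
alternating = [p for p in G if epsilon(p) == 1]

multiplicative = all(
    epsilon(compose(p, q)) == epsilon(p) * epsilon(q) for p in G for q in G
)
```

For $n = 4$ the map `epsilon` takes only the values $\pm 1$, is multiplicative, and its kernel `alternating` has $4!/2 = 12$ elements.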

In many questions it is important to have a list of all normal subgroups of the
groups $\mathfrak{S}_n$ and $\mathfrak{A}_n$. The answer is as follows.


Theorem I. For $n \ne 4$ the group $\mathfrak{S}_n$ has no normal subgroups other than $\{e\}$, $\mathfrak{A}_n$
and $\mathfrak{S}_n$; and $\mathfrak{A}_n$ has none other than $\{e\}$ and $\mathfrak{A}_n$. When $n = 4$ there exists in addition
a normal subgroup (in both $\mathfrak{S}_4$ and $\mathfrak{A}_4$) of order 4, consisting of $e$ and of all
permutations of cycle type (2, 2).


Another way in which important examples of finite groups arise is in the study 
of finite subgroups of certain well-known groups. Let us treat a classical example, 
the finite subgroups of the group of orthogonal transformations of Euclidean 
space. 

The groups of most interest from the geometric and physical points of view
are the groups acting on 3-dimensional space. But to have a simple model, we
treat first the analogous results for the plane. Orthogonal transformations that
preserve the orientation will be called rotations; these also form a group.


Example 3. Finite Groups of Rotations of the Plane.

Theorem II. Finite groups of rotations of the plane are cyclic; any such group of
order $n$ consists of all rotations about a fixed point through angles of $\frac{2k\pi}{n}$ for
$k = 0, 1, \dots, n - 1$.


We denote this group by $C_n$. It can be characterised as the group of all
symmetries of an oriented regular $n$-gon (see Figure 18 for the case $n = 7$).


Fig. 18 


Theorem III. A finite subgroup of the group of orthogonal transformations of
the plane containing reflections is the group of all symmetries of a regular $n$-gon;
this group is denoted by $D_n$. It has order $2n$, and consists of the transformations in
$C_n$ and $n$ reflections in the $n$ axes of symmetry of the regular $n$-gon.

The cases $n = 1$ and 2 are somewhat exceptional: $D_1$ is the group of order 2
generated by a single reflection, and $D_2$ the group of order 4 generated by reflections
in the x- and y-axes.
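
For $n \ge 3$ the statement about the order of the symmetry group of a regular $n$-gon can be checked by generating the group from one rotation and one reflection, regarded as permutations of the vertexes $0, \dots, n - 1$. This is a sketch under that assumption only; it does not cover the exceptional cases $n = 1, 2$, and the function names are ours.

```python
def dihedral(n):
    """Symmetries of a regular n-gon (n >= 3) as permutations of its vertices 0..n-1."""
    rotation = tuple((i + 1) % n for i in range(n))     # rotation through 2*pi/n
    reflection = tuple((-i) % n for i in range(n))      # reflection in the axis through vertex 0

    def compose(p, q):
        return tuple(p[q[i]] for i in range(n))

    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    # Close up under left multiplication by the two generators.
    while frontier:
        g = frontier.pop()
        for s in (rotation, reflection):
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def is_rotation(p):
    """Rotations are the symmetries i -> i + k (mod n), which keep the cyclic order."""
    n = len(p)
    return all((p[(i + 1) % n] - p[i]) % n == 1 for i in range(n))

orders = {n: len(dihedral(n)) for n in range(3, 9)}
rotation_counts = {n: sum(is_rotation(p) for p in dihedral(n)) for n in range(3, 9)}
```

For each $n$ from 3 to 8 the generated group has $2n$ elements, of which exactly $n$ are rotations; the remaining $n$ are the reflections.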


Example 4. The classical examples of finite groups of rotations of 3-space relate
to regular polyhedrons: each regular polyhedron $M$ has an associated group $G_M$
consisting of all rotations preserving $M$. The regularity of a polyhedron is
reflected in the presence of a large number of symmetries. Suppose that $M$ is a
convex bounded $n$-dimensional polyhedron in $n$-space. Define a flag of $M$ to be
a set $F = \{M_0, M_1, \dots, M_{n-1}\}$, where $M_i$ is an $i$-dimensional face of $M$ and
$M_i \subset M_{i+1}$. We say that a polyhedron $M$ is regular if its symmetry group $G_M$ acts
transitively on the set of all flags of $M$. In particular, for $n = 3$ the group $G_M$
should act transitively on the set of pairs consisting of a vertex and an edge out
of it. It is easy to see that the stabiliser of such a pair consists only of the identity
transformation, so that the order of $G_M$ equals the product of the number of
vertexes of the regular polyhedron by the number of edges out of any vertex. All
the regular polyhedrons were determined in antiquity (and they are sometimes
called the Platonic solids). They are the tetrahedron, cube, octahedron, dodecahedron
and icosahedron. To each regular polyhedron there is a related dual polyhedron,
whose vertexes are the midpoints of the faces of the first; obviously both of these
have the same group $G_M$. The tetrahedron is self-dual, the cube is dual to the
octahedron, and the dodecahedron is dual to the icosahedron (see Figure 19).
The corresponding groups are denoted by $T$, $O$, $Y$ respectively; from what was
said above, their orders are given by


        $|T| = 12, \quad |O| = 24, \quad |Y| = 60.$


Fig. 19


In addition to these groups, there exist other obvious examples of finite subgroups
of the rotation group: the cyclic group $C_n$ of order $n$, consisting of rotations
around an axis $l$ through angles of $\frac{2k\pi}{n}$ for $k = 0, 1, \dots, n - 1$, and the dihedral
group $D_n$ of order $2n$, which contains $C_n$ and in addition $n$ rotations through an
angle of $\pi$ around axes lying in a plane orthogonal to $l$, meeting $l$ and making
angles with one another multiples of $\frac{2\pi}{n}$. The group $D_n$ can be viewed as the
group of rotations of a degenerate regular polyhedron, a plane $n$-gon (Figure
20); in this form we have already met it in Example 3.


The axes of rotation through z in the group D; 


Fig. 20


Theorem IV. The cyclic and dihedral groups and the groups of the tetrahedron,
octahedron and icosahedron are all the finite subgroups of the group of rotations
of 3-space.


The precise meaning of this assertion is that every such group $G$ is either cyclic,
or there exists a regular polyhedron $M$ such that $G = G_M$. Since all regular
polyhedrons of one type can be obtained from one another by a rotation and a
uniform dilation, it follows from this that the corresponding groups are conjugate
subgroups of the group of rotations.

The group of rotations of the regular tetrahedron acts on the set of vertexes.
It only realises even permutations of them; this becomes obvious if we write the
volume of the tetrahedron in the form


        $\dfrac{1}{6} \begin{vmatrix} 1 & 1 & 1 & 1 \\ x_1 & x_2 & x_3 & x_4 \\ y_1 & y_2 & y_3 & y_4 \\ z_1 & z_2 & z_3 & z_4 \end{vmatrix},$


where $(x_i, y_i, z_i)$ are the coordinates of its vertexes. From this it is clear that the
tetrahedral group $T$ is isomorphic to the alternating group $\mathfrak{A}_4$. Algebraically,
the action of $T$ on the set of vertexes corresponds to its action by conjugacy on
the set of subgroups of order 3 (each such subgroup consists of the rotations
about an axis joining a vertex to the centre of the opposite face).

Entirely similarly, subgroups of order 3 of the octahedral group $O$ correspond
to axes joining the centres of opposite faces of the octahedron. There are 4
such axes, hence 4 such subgroups, and the action of $O$ on them defines an
isomorphism $O \cong \mathfrak{S}_4$.

In the icosahedral group Y, consider first the elements of order 2; these are 
given by rotations through 180° about axes joining the midpoints of opposite 
edges. Since the number of edges is 30, the number of such axes is 15, and hence 
there are 15 elements of order 2. It can be shown that for each axis of order 2 
there exist another two orthogonal to it and to one another (for example the axis 
joining the midpoints of AB and CD in Figure 19, and the two others obtained 
by rotating this about an axis of order 3 passing through the centre of the triangle 
CFE). The 3 elements of order 2 defined by these axes, together with the identity 
element, form an Abelian group of order 4 isomorphic to $\mathbb{Z}/2 \times \mathbb{Z}/2$; therefore
the icosahedral group $Y$ contains 5 such subgroups. The action of the group $Y$
by conjugacy on these 5 subgroups (or on the 5 triples of mutually orthogonal
axes of order 2) defines an isomorphism $Y \cong \mathfrak{A}_5$.

One relation between these groups often plays an important role.
From the isomorphisms $O \cong \mathfrak{S}_4$ and $T \cong \mathfrak{A}_4$ and from Theorem I it follows that
the group $O$ contains a unique normal subgroup of index 2 isomorphic to $T$. We
can see this inclusion if we truncate the corners of a regular tetrahedron in such
a way that the plane sections divide its edges in half (Figure 21). What is left is a
regular octahedron; all the symmetries of the tetrahedron form the group $T$ and
preserve the inscribed octahedron, and hence are contained in $O$.




Fig. 21 


The groups of regular polyhedrons occur in nature as the symmetry groups of
molecules. For example (Figure 22), the molecule $\mathrm{H_3C{-}CCl_3}$ has symmetry
group $C_3$, the ethane molecule $\mathrm{C_2H_6}$ symmetry group $D_3$, the methane molecule
$\mathrm{CH_4}$ the tetrahedral group $T$ (the atom C is at the centre of the tetrahedron, and
the atoms H at the vertexes), and uranium hexafluoride $\mathrm{UF_6}$ the octahedral group
$O$ (the atom U is at the centre of the octahedron, and the atoms F at the vertexes).


        $\mathrm{CH_4}$        $\mathrm{UF_6}$


Fig. 22


A classification of finite subgroups of the group of all orthogonal
transformations follows easily from Theorem IV. The group of all orthogonal
transformations is the direct product $\Gamma \times \{e, e'\}$, where $\Gamma$ is the group of rotations of 3-space,
and $Z = \{e, e'\}$ is the group of order 2 consisting of the identity transformation
and the central reflection $e'$ (with $e'(x) = -x$). It is a not too difficult algebraic
problem to investigate the subgroups of a direct product $\Gamma \times H$ if we know all
the subgroups of $\Gamma$ and of $H$. In the simplest case, when as in our case $H = Z$ is
a group of order 2, the answer is as follows. A subgroup of $\Gamma \times Z$ is either
entirely contained in $\Gamma$, or is of the form $G_0 \times Z$ where $G_0 \subset \Gamma$, or is obtained
from a subgroup $G \subset \Gamma$ by the following trick: choose a subgroup $G_0 \subset G$ of
index 2 in $G$, and let $G = G_0 \cup V$ be its decomposition into cosets under $G_0$; it
is easy to check that the set of elements $g_0 \in G_0$ and $e'v$ with $v \in V$ then forms a
group, and this is the subgroup we take. For example, in the 2-dimensional case, the
group $D_n$ is obtained in this way from $G = C_{2n}$ and $G_0 = C_n$. The construction
of groups of this last type requires a listing of all rotation groups $G$ and of their
subgroups $G_0$ of index 2. The corresponding group of orthogonal
transformations is denoted by $GG_0$. We thus obtain the following result.


Theorem V. The finite groups of orthogonal transformations of 3-dimensional
space not consisting only of rotations are the following:


C_n × Z, D_n × Z, T × Z, O × Z, Y × Z, C_2n C_n, D_2n D_n, D_n C_n, OT.

The final group arises because of the inclusion T ⊂ O illustrated in Figure 21.
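
The coset trick behind the groups of the last type can be tried out on a small case. The Python sketch below is only an illustration under stated assumptions: it takes the rotation group C_4 about the z-axis with its subgroup C_2 of index 2 (a choice made here, not one singled out by the text), forms the set of elements g_0 and e′v, and checks that this set is closed under multiplication and contains transformations that are not rotations. All function and variable names are hypothetical.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def det3(m):
    """Determinant of a 3x3 matrix."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

E = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
R = ((0, -1, 0), (1, 0, 0), (0, 0, 1))           # rotation through pi/2 about the z-axis
CENTRAL = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))   # central reflection e'

R2 = matmul(R, R)
R3 = matmul(R2, R)
G_bar = [E, R, R2, R3]                 # the rotation group C_4
G0 = [E, R2]                           # its subgroup C_2 of index 2
V = [g for g in G_bar if g not in G0]  # the other coset of G0

# Elements g0 of G0 together with the products e'v for v in V.
G = set(G0) | {matmul(CENTRAL, v) for v in V}

is_closed = all(matmul(a, b) in G for a in G for b in G)
has_non_rotation = any(det3(g) == -1 for g in G)
print(len(G), is_closed, has_non_rotation, CENTRAL in G)
```

A finite set of invertible matrices that contains the identity and is closed under multiplication is a group, so the check confirms that the construction gives a group of order 4. It contains non-rotations but not the central reflection itself, so it is neither contained in the rotation group nor of the form G_0 × Z.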


Recall that for any finite group G ⊂ GL(n, R) there exists a positive definite
quadratic form f on R^n invariant under G (§ 10, Example 4). From the fact that
the form f can be reduced to a sum of squares by a nondegenerate linear
transformation φ, it follows, as we can see easily, that the group φ^{-1}Gφ
consists of orthogonal transformations. Hence Theorems III and V give us also a
classification of finite subgroups of GL(2, R) and GL(3, R).
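
One standard way to produce an invariant form is to sum |g(x)|² over all g ∈ G; this averaging is assumed here for illustration, and the text's own argument is the one in § 10, Example 4. The Python sketch below does this for the cyclic group of order 3 generated by an integer matrix A chosen for the example, and checks that the resulting form is invariant and positive definite. The names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def transpose(a):
    return tuple(zip(*a))

E = ((1, 0), (0, 1))
A = ((0, -1), (1, -1))  # an integer matrix of order 3 (example choice)

# The cyclic group generated by A: E, A, A^2.
group = [E]
while matmul(group[-1], A) != E:
    group.append(matmul(group[-1], A))

# Gram matrix F of f(x) = sum over g of |g(x)|^2 (the factor 1/|G| is omitted).
F = tuple(
    tuple(sum(matmul(transpose(g), g)[i][j] for g in group) for j in range(2))
    for i in range(2)
)

# Invariance f(g(x)) = f(x) is the matrix identity g^T F g = F.
is_invariant = all(matmul(matmul(transpose(g), F), g) == F for g in group)
is_positive_definite = F[0][0] > 0 and F[0][0] * F[1][1] - F[0][1] * F[1][0] > 0
print(F, is_invariant, is_positive_definite)
```

Here F is the matrix with rows (4, −2), (−2, 4), that is, f(x, y) = 4x² − 4xy + 4y². In a basis reducing f to a sum of squares the group becomes a group of rotations through multiples of 2π/3.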


Example 5. Every finite group of rotations of 3-space preserves the sphere S
centred at the origin, and hence can also be interpreted as a group of motions of
spherical geometry. If we identify the sphere with the Riemann sphere of a
complex variable z, then the fractional-linear transformations


z ↦ (αz + β)/(γz + δ), with α, β, γ, δ ∈ C and αδ − βγ ≠ 0,        (5)


are realised as conformal transformations of the sphere S. The motions of S form
part of the conformal transformations, and hence the finite groups of motions of
the sphere we have constructed provide finite subgroups of the group of
fractional-linear transformations.

Theorem VI. All finite subgroups of the group of fractional-linear transforma- 
tions (5) are obtained in this way. It can be shown moreover that the subgroups 
corresponding to regular polyhedrons of the same type are conjugate in the group 
of fractional-linear transformations. 

One of the applications of this result is as follows. Let


d²w/dz² + p(z) dw/dz + q(z)w = 0


be a differential equation with rational functions p(z), q(z) as coefficients, and
w_1, w_2 two linearly independent solutions. The function v = w_1/w_2 is a
many-valued analytic function of the complex variable z. Moving in the z-plane around
the poles of p(z) and q(z) replaces the functions w_1 and w_2 by linear combinations,
so that v transforms according to formula (5), that is, v ↦ (αv + β)/(γv + δ).
Suppose now that v is an algebraic function. Then v has only a finite number of
sheets, and hence we obtain a finite group of transformations of the form (5).
Since we know all such groups, we can describe all the second order linear
differential equations having algebraic solutions.


Example 6. The Symmetry Group of Plane Lattices. The ring of integers Z is
a discrete analogue of the real number field R, the module Z^n a discrete analogue
of the vector space R^n and the group GL(n, Z) a discrete analogue of GL(n, R).
Following this analogy, we now study the finite subgroups of GL(2, Z), and in
the following example, of GL(3, Z). We will be interested in classifying these
groups up to conjugacy in the groups GL(2, Z) and GL(3, Z) containing them.
In the following section § 14 we will see that this problem has physical
applications in crystallography.

The question we are considering can be given the following geometric
interpretation. We think of Z^n as a group C of vectors of the n-dimensional space R^n;
a group C ⊂ R^n of this form is called a lattice. Any group G ⊂ GL(n, Z) is realised
as a group of linear transformations of R^n preserving a lattice C. For any finite
group G of linear transformations of R^n there exists an invariant metric, that is,
a positive definite quadratic form f(x) on R^n such that f(g(x)) = f(x) for all g ∈ G
(see § 10, Example 4). A quadratic form defines on R^n the structure of a Euclidean
space, and the group G becomes a finite group of orthogonal transformations
taking C into itself. Our problem then is equivalent to classifying the symmetry
groups of lattices in Euclidean space R^n. By a symmetry we mean of course an
orthogonal transformation taking the lattice into itself. It is easy to see that
the group of all symmetries of any lattice is finite. The groups G_1 and G_2 of
symmetries of lattices C_1 and C_2 will correspond to conjugate subgroups in the
group of integral matrixes with determinant ±1 if there exists a linear
transformation φ which takes C_1 into C_2 and G_1 into G_2. That is, C_2 = φ(C_1) and
G_2 = φG_1φ^{-1}, and φ takes the action of G_1 on C_1 into the action of G_2 on C_2.
We will say that such lattices are equivalent.

Lattices having nontrivial symmetries (other than the central reflection) are 
called Bravais lattices, and their symmetry groups Bravais groups. 

We now study this problem in the case of plane lattices. Our investigation 
breaks up into two stages. First of all, we must determine which finite groups of 
orthogonal transformations preserve some lattice (that is, consist of symmetries 
of it). For reasons which will become clear in the next section, these groups are 
called crystal classes (or crystallographic classes). They are of course contained 
in the list given by Theorem III. The basic tool in sorting out which ones occur
is the following assertion, which is elementary to prove: a plane lattice can only
go into itself under a rotation about one of its points through an angle of 0, π,
2π/3, π/2 or π/3.


Theorem VII. There are 10 2-dimensional crystal classes,
C_1, C_2, C_3, C_4, C_6, D_1, D_2, D_3, D_4, D_6.
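
The restriction on the orders of rotations can be checked numerically. In a basis of the lattice a symmetry is given by an integer matrix, so its trace is an integer; for a rotation through 2π/n the trace is 2cos(2π/n). The Python sketch below (the function name is hypothetical) lists the orders n for which this necessary condition holds.

```python
import math

def trace_is_integer(n, tol=1e-9):
    """True if the trace 2*cos(2*pi/n) of a rotation of order n is an integer."""
    trace = 2 * math.cos(2 * math.pi / n)
    return abs(trace - round(trace)) < tol

allowed_orders = [n for n in range(1, 25) if trace_is_integer(n)]
print(allowed_orders)
```

Only the orders 1, 2, 3, 4 and 6 survive, in agreement with the cyclic groups listed in Theorem VII. The integrality of the trace is only a necessary condition; that each of these orders really occurs is shown by the lattices of Figure 23.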

Figure 23 illustrates the fundamental parallelograms of lattices having the
various symmetry groups; under each, we indicate the symmetry groups which
the corresponding lattices admit. We have (from left to right): an arbitrary
parallelogram, an arbitrary rectangle, an arbitrary rhombus, a square and a
parallelogram composed of two equilateral triangles. The corresponding lattices
are called: general, rectangular, rhombic, square and hexagonal, and are denoted
by Γ_gen, Γ_rect, Γ_rhomb, Γ_sq and Γ_hex.


Lattice        Symmetry groups admitted

general        C_1, C_2
rectangular    C_1, C_2, D_1, D_2
rhombic        C_1, C_2, D_1, D_2
square         C_1, C_2, C_4, D_1, D_2, D_4
hexagonal      C_1, C_2, C_3, C_6, D_1, D_2, D_3, D_6

Fig. 23


However, Theorem VII does not quite solve our problem. Inequivalent
symmetry groups may belong to the same crystal class. Algebraically, this means that
two groups G and G′ ⊂ GL(2, Z) may be conjugate in the group of orthogonal
transformations, but not in GL(2, Z). As an example we have the groups G and
G′ of order 2, where G is generated by the matrix with rows (1, 0), (0, −1) and G′
by the matrix with rows (0, 1), (1, 0).
Geometrically these correspond to symmetries of order 2 of lattices whose
fundamental parallelograms are the rectangle and the rhombus in Figure 23. The
symmetry consists in the first case of a reflection in the horizontal side of the 
rectangle, and in the second of a reflection in the horizontal diagonal of the 
rhombus. They are inequivalent since the lattice corresponding to the rectangle 
has a basis of vectors which the symmetry multiplies by 1 and —1, whereas 
that corresponding to the rhombus is not generated by vectors invariant under 
the symmetry and vectors multiplied by —1 by the symmetry. However, this 
phenomenon does not occur very often. 
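
The inequivalence in this example can also be seen by direct computation. The Python sketch below assumes that the two generators are the matrices A with rows (1, 0), (0, −1) and B with rows (0, 1), (1, 0), and searches among integer matrices P with small entries and det P = ±1 for one satisfying PA = BP. The bound on the entries and the helper names are choices made for this illustration.

```python
from itertools import product

A = ((1, 0), (0, -1))   # reflection fixing the first basis vector
B = ((0, 1), (1, 0))    # reflection interchanging the two basis vectors

def matmul(a, b):
    """Product of two 2x2 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def integral_conjugators(bound):
    """Integer matrices P with |entries| <= bound, det P = +-1 and PA = BP."""
    found = []
    for a, b, c, d in product(range(-bound, bound + 1), repeat=4):
        P = ((a, b), (c, d))
        if abs(det2(P)) == 1 and matmul(P, A) == matmul(B, P):
            found.append(P)
    return found

found = integral_conjugators(5)
P_real = ((1, 1), (1, -1))   # satisfies PA = BP, but has determinant -2
print(found, matmul(P_real, A) == matmul(B, P_real), det2(P_real))
```

The search finds nothing, and this is not an accident of the bound: PA = BP forces P to have rows (a, b), (a, −b), whose determinant −2ab is even. The matrix P_real does conjugate A into B over the real numbers, and dividing it by √2 gives an orthogonal matrix.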


Theorem VIII. There are 13 inequivalent symmetry groups of plane lattices:

C_1(Γ_gen), C_2(Γ_gen), C_4(Γ_sq), C_3(Γ_hex), C_6(Γ_hex),
D_1(Γ_rect), D_1(Γ_rhomb), D_2(Γ_rect), D_2(Γ_rhomb),
D_4(Γ_sq), D_3′(Γ_hex), D_3″(Γ_hex), D_6(Γ_hex).

We have indicated in brackets the lattices on which the given groups are realised
as symmetry groups.


We have already treated the example of the group D_1 realised on two lattices
Γ_rect and Γ_rhomb. The group D_2 is realised on the same lattices, and is obtained by
adding to D_1 the central reflection. The most delicate are the realisations D_3′ and
D_3″ of D_3 as a symmetry group of Γ_hex. Both of them are contained in D_6, and are
symmetry groups of equilateral triangles (as one expects of D_3), but these triangles
are inscribed in different ways in the hexagon preserved by D_6 (Figure 24).


Fig. 24 


Example 7. The Symmetry Groups of Space Lattices. We consider finite 
subgroups of GL(3,Z), using without further explanation the terminology 
introduced in Example 6. 


Theorem IX. There are 32 3-dimensional crystal classes:


Crystal system    Crystal classes

Triclinic         C_1 × Z, C_1
Monoclinic        C_2 × Z, C_2, C_2C_1
Orthorhombic      D_2 × Z, D_2, D_2C_2
Trigonal          D_3 × Z, D_3, D_3C_3, C_3 × Z, C_3
Tetragonal        D_4 × Z, D_4, D_4C_4, D_4D_2, C_4 × Z, C_4, C_4C_2
Hexagonal         D_6 × Z, D_6, D_6C_6, D_6D_3, C_6 × Z, C_6, C_6C_3
Cubic             O × Z, O, OT, T × Z, T


The notation for the groups is taken from Theorem V. The groups are arranged in
the table so that each row consists of the subgroups of the first group in the row.


The series of groups (the rows of the table) are called crystal systems in
crystallography, and have the exotic names given in the table. Each crystal system
is characterised as a set of symmetries of some polyhedron. These polyhedrons
are listed in Figure 25. (Their analogues in the plane are the parallelogram,
rectangle, rhombus, square, and equilateral triangle.)

The crystal classes can be represented in a very intuitive way, as in Figure 26,
Table 1, taken from the book [Delone, Padurov and Aleksandrov 32 (1934)];
the notation used in Table 1 is given below in Table 2.

Triclinic (arbitrary parallelepiped)
Monoclinic (rectangular prism)
Orthorhombic (arbitrary rectangular parallelepiped)
Trigonal (cube compressed along a space diagonal)
Tetragonal (rectangular prism with square base)
Hexagonal (rectangular prism with base made up of two equilateral triangles)
Cubical (cube)


Fig. 25 


We will not classify all types of inequivalent symmetry groups of 3-dimensional 
lattices. There are 72 different types. 

Higher-dimensional generalisations of the 2- and 3-dimensional constructions 
just treated cannot of course be studied in such detail. Here there are only a few 
general theorems. 


X. Jordan’s Theorem. For any n, a finite group G of motions of n-space has an 
Abelian normal subgroup A whose index (G : A) in G is bounded by a constant w(n) 
depending only on n. 


In the 3-dimensional case, the theorem is well illustrated by the dihedral group
D_n, which contains the cyclic group C_n as a normal subgroup of index 2.
For the analogues of the Bravais groups one proves easily the following result. 


Theorem XI. For given n there are a finite number of nonisomorphic finite
subgroups in the group of integral matrixes of determinant ±1.

Thus the problem reduces to describing up to conjugacy the subgroups of
GL(n, Z) isomorphic to a given group G. We meet here a problem which is
analogous to the problem of representations of finite groups which we discussed
in § 10 and will discuss further in § 17. The difference is that now, in place of linear
transformations of a vector space (and nondegenerate matrixes) we have
automorphisms of the module Z^n (and integral n × n matrixes of determinant ±1).
The corresponding notion (which we will not define any more precisely) is an
n-dimensional integral representation. A basic result is another theorem of
Jordan.


Theorem XII. Jordan’s Theorem. Every finite group has only a finite number 
of inequivalent integral representations of given dimension. 


Example 8. The symmetric groups are a special case of an important class
of groups, finite groups generated by reflections. Choose an orthonormal basis
e_1, ..., e_n in n-dimensional Euclidean space R^n; we send a permutation σ of the
set {1, ..., n} into the linear transformation σ̂ which permutes the vectors of this
basis: σ̂(e_i) = e_σ(i). The map σ ↦ σ̂ is an isomorphism of 𝔖_n with a certain
subgroup S of the group of orthogonal transformations of R^n; obviously, S is
generated by the transformations σ̂_i corresponding to the transpositions σ_i of
i and i + 1. The set of vectors fixed by σ̂_i is the linear subspace L ⊂ R^n with
basis e_1, ..., e_{i−1}, e_i + e_{i+1}, e_{i+2}, ..., e_n; clearly, dim L = n − 1. If we consider
a vector f orthogonal to the hyperplane L, for example f = e_i − e_{i+1}, then σ̂_i is
given by the formulas


σ̂_i(x) = x for x ∈ L;
σ̂_i(f) = −f for (f, L) = 0.


Any transformation s given by these formulas (for some choice of hyperplane L)
is called a reflection. Obviously, s² = e. A group of orthogonal transformations
which has a system of generators consisting of reflections is called a group
generated by reflections.
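
The formulas for the reflection attached to a transposition are easy to test in coordinates. The Python sketch below (with hypothetical helper names, and with n = 4 and i = 2 chosen only for the example) builds the matrix that interchanges e_i and e_{i+1} and checks that it fixes the hyperplane L, reverses f = e_i − e_{i+1}, and has square equal to the identity.

```python
def transposition_matrix(n, i):
    """Matrix permuting the basis vectors e_i and e_(i+1); i is 1-based."""
    m = [[int(r == c) for c in range(n)] for r in range(n)]
    m[i - 1][i - 1] = m[i][i] = 0
    m[i - 1][i] = m[i][i - 1] = 1
    return m

def apply(m, x):
    """Image of the vector x under the matrix m."""
    return [sum(row[c] * x[c] for c in range(len(x))) for row in m]

def basis_vector(n, k):
    return [int(j == k - 1) for j in range(n)]

n, i = 4, 2
s = transposition_matrix(n, i)
e = {k: basis_vector(n, k) for k in range(1, n + 1)}

f = [a - b for a, b in zip(e[i], e[i + 1])]                # f = e_i - e_(i+1)
L_basis = [e[k] for k in range(1, i)]                      # e_1, ..., e_(i-1)
L_basis.append([a + b for a, b in zip(e[i], e[i + 1])])    # e_i + e_(i+1)
L_basis += [e[k] for k in range(i + 2, n + 1)]             # e_(i+2), ..., e_n

fixes_L = all(apply(s, v) == v for v in L_basis)
negates_f = apply(s, f) == [-c for c in f]
is_involution = all(apply(s, apply(s, e[k])) == e[k] for k in e)
print(len(L_basis), fixes_L, negates_f, is_involution)
```

The basis of L has n − 1 elements, as the text states, and the three checks are exactly the defining properties of a reflection.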

The basic results of the theory of finite groups generated by reflections are as 
follows. 


Theorem XIII. For any finite group G generated by reflections in a Euclidean
space E there exists a uniquely determined decomposition


E = E_0 ⊕ E_1 ⊕ ··· ⊕ E_p


of E as a direct sum of pairwise orthogonal subspaces E_i invariant under G with
the following properties:

(1) E_0 consists of vectors x ∈ E with g(x) = x for all g ∈ G; for i = 1, ..., p, each
E_i has no subspaces invariant under G, apart from 0 and E_i.

(2) The group G is a direct sum of groups G_i for i = 1, ..., p, where G_i consists
of all transformations g ∈ G fixing x ∈ E_j for j ≠ i, and G_i is also a group generated
by reflections.


Fig. 26, Table 1

Fig. 26, Table 2 (notation: symbols for a centre of symmetry, for axes of rotation
through given angles such as π, 2π/3 and π/2, and for axes of rotation composed
with a reflection in an orthogonal plane)


For example, for the action of the symmetric group 𝔖_n on n-dimensional space
R^n described above, we have the decomposition R^n = E_0 ⊕ E_1, where

E_0 = {α(e_1 + ··· + e_n)},
E_1 = {α_1 e_1 + ··· + α_n e_n | α_1 + ··· + α_n = 0}.        (6)

A group generated by reflections G is said to be irreducible if E does not have
subspaces invariant under G and distinct from 0 and E.
Let σ_1, ..., σ_n be a set of reflections. Obviously


σ_i² = e.        (7)


There is a convenient way of describing certain other special relations between
the generators, namely those of the form


(σ_i σ_j)^{m_ij} = e.        (8)


For this we draw a graph with vertexes corresponding to the reflections σ_1, ...,
σ_n, and with two vertexes joined by an edge if relation (8) holds with m_ij > 2. If
m_ij > 3 then we write the number m_ij over the corresponding edge. It is easy to
see that relation (8) with m_ij = 2 just means that σ_i and σ_j commute.
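
For the reflections coming from transpositions of neighbouring basis vectors, the exponents in the relations of the form (σ_i σ_j)^m = e, and hence the graph, can be computed directly. The Python sketch below is a small illustration with hypothetical helper names and with n = 5 chosen for the example: it finds the order of each product of two distinct generators and keeps the pairs of order greater than 2 as edges.

```python
def transposition_matrix(n, i):
    """Matrix permuting the basis vectors e_i and e_(i+1); i is 1-based."""
    m = [[int(r == c) for c in range(n)] for r in range(n)]
    m[i - 1][i - 1] = m[i][i] = 0
    m[i - 1][i] = m[i][i - 1] = 1
    return m

def matmul(a, b):
    size = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]

def order(m):
    """Smallest k >= 1 with m^k equal to the identity matrix."""
    identity = [[int(r == c) for c in range(len(m))] for r in range(len(m))]
    power, k = m, 1
    while power != identity:
        power, k = matmul(power, m), k + 1
    return k

n = 5
gens = {i: transposition_matrix(n, i) for i in range(1, n)}
exponents = {(i, j): order(matmul(gens[i], gens[j]))
             for i in gens for j in gens if i < j}

# Two vertexes are joined when the exponent exceeds 2.
edges = sorted(pair for pair, m_ij in exponents.items() if m_ij > 2)
print(edges, max(exponents.values()))
```

Neighbouring generators have product of order 3 and all other pairs commute, so the graph is a chain of n − 1 vertexes with no numbers written over the edges.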


Theorem XIV. In each irreducible finite group G generated by reflections, there
is a system of generators σ_1, ..., σ_n that are themselves reflections, connected by
relations described in one of the graphs listed below. These relations define the
group G.


A_n      o---o---o--- ... ---o---o

                           4
B_n      o---o--- ... ---o---o

                           o
                          /
D_n      o---o--- ... ---o
                          \
                           o

                 o
                 |
E_6      o---o---o---o---o

                 o
                 |
E_7      o---o---o---o---o---o

                 o
                 |
E_8      o---o---o---o---o---o---o

               4
F_4      o---o---o---o

           5
H_3      o---o---o

           5
H_4      o---o---o---o

           p
I_2(p)   o---o


The subscript n (in A_n, B_n and so on) indicates the number of vertexes of the graph,
and also the dimension of the space in which G acts.


The graph A_n corresponds to the example we already know of the group 𝔖_{n+1}
acting in the space E_1 of the decomposition (6). It can be interpreted as the group
of symmetries of the regular n-dimensional simplex, given in coordinates by the
conditions α_1 + ··· + α_{n+1} = 1 and α_i ≥ 0.

The graph B_n corresponds to the group consisting of all permutations and sign
changes of the vectors of some orthonormal basis of an n-dimensional vector
space. This is the group of symmetries of the n-dimensional cube (or octahedron);
it has order 2^n n!. For n = 3 it is O × Z. The graph D_n corresponds to a subgroup
of index 2 in the group corresponding to B_n. It consists of the permutations
and multiplication of basis elements by numbers ε_i = ±1 such that ∏ε_i = 1.
For n = 3 it is OT ⊂ O × Z (see Figure 21). The graph H_3 corresponds to
the symmetry group of the icosahedron, and I_2(p) to the dihedral group D_p.
The graphs H_4 and F_4 correspond to the symmetry groups of certain regular
4-dimensional polyhedrons.

All the groups listed in Theorem XIV are crystal classes, except for H_3, H_4 and
I_2(p) for p = 5 and p ≥ 7.
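
The orders quoted for the groups of types B_n and D_n can be confirmed by enumeration for n = 3. In the Python sketch below (all names hypothetical) an element is recorded as a pair consisting of a permutation of the basis vectors and a choice of signs; the determinant of the corresponding matrix is the sign of the permutation times the product of the signs.

```python
from itertools import permutations, product
from math import factorial, prod

def parity(p):
    """Sign of a permutation given as a tuple of images."""
    inversions = sum(1 for a in range(len(p))
                     for b in range(a + 1, len(p)) if p[a] > p[b])
    return -1 if inversions % 2 else 1

n = 3
# Type B_n: all permutations combined with all sign changes of the basis vectors.
B = [(p, s) for p in permutations(range(n)) for s in product((1, -1), repeat=n)]
# Type D_n: the sign changes are restricted to those with product +1.
D = [(p, s) for p, s in B if prod(s) == 1]

dets_D = [parity(p) * prod(s) for p, s in D]
print(len(B), len(D), dets_D.count(-1))
```

The counts are 48 = 2³·3! for B_3, which is the order of O × Z, and 24 for D_3. Half of the elements of D_3 have determinant −1, so this group of order 24 is not the rotation group O; this is consistent with its description as OT.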


Example 9. There exists another method of constructing finite groups which
we describe in detail later, and for the moment only hint at with the following
example. Consider the group GL(n, F_q) consisting of nondegenerate n × n matrixes
with entries in a finite field F_q. It is isomorphic to the group Aut F_q^n of linear
transformations of the vector space F_q^n; each such transformation is determined
by a choice of basis in F_q^n. Hence |GL(n, F_q)| is equal to the number of bases
e_1, ..., e_n in F_q^n. We can take e_1 to be any of the q^n − 1 nonzero vectors of F_q^n;
for given e_1, we can take e_2 to be any of the q^n − q vectors not proportional to
e_1; for given e_1 and e_2, we can take e_3 to be any of the q^n − q^2 vectors which
are not linear combinations of e_1 and e_2, and so on. Therefore


|GL(n, F_q)| = (q^n − 1)(q^n − q)(q^n − q^2) ··· (q^n − q^{n−1}).        (9)
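
Formula (9) can be compared with a brute-force count in small cases. The Python sketch below makes one simplifying assumption: q is taken to be a prime p, so that arithmetic modulo p models the field F_p. The function names are hypothetical.

```python
from itertools import product

def gl_order_formula(n, q):
    """Right-hand side of formula (9)."""
    result = 1
    for k in range(n):
        result *= q ** n - q ** k
    return result

def det(m):
    """Determinant by expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def gl_order_brute_force(n, p):
    """Number of n x n matrices over Z/(p) whose determinant is nonzero mod p."""
    count = 0
    for entries in product(range(p), repeat=n * n):
        rows = [entries[r * n:(r + 1) * n] for r in range(n)]
        if det(rows) % p != 0:
            count += 1
    return count

cases = [(2, 2), (2, 3), (2, 5), (3, 2)]
results = {(n, p): (gl_order_brute_force(n, p), gl_order_formula(n, p))
           for n, p in cases}
print(results)
```

In each case the two numbers agree; for instance GL(2, F_3) has (9 − 1)(9 − 3) = 48 elements and GL(3, F_2) has 7·6·4 = 168.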


One of the applications of the groups GL(n, F_q) is the proof of Theorem XI.
We fix a prime number p ≠ 2 and consider the homomorphism φ_p: Z → Z/(p) =
F_p. It defines a homomorphism of matrix groups


φ_p: GL(n, Z) → GL(n, F_p),


the kernel of which consists of matrixes of the form A = E + pB with det A = ±1.
Let us prove that any finite group G ⊂ GL(n, Z) is mapped isomorphically by φ_p
onto some subgroup of GL(n, F_p). Since there are only a finite number of
subgroups of GL(n, F_p), the assertion will follow from this, and (9) will give an
estimate for |G|. The kernel of G → GL(n, F_p) is G ∩ Ker φ_p, and we need to prove
that this subgroup consists of the identity element only. For this we prove that
Ker φ_p does not contain any elements of finite order other than E. Let A =
E + p^r B with B ∈ M_n(Z) and not all elements of B divisible by p, and suppose
that A^n = E. By the binomial theorem


n p^r B + Σ_{k=2}^{n} (n choose k) p^{rk} B^k = 0.


But an elementary arithmetic argument shows that for p > 2 and k > 1, all the
numbers inside the summation sign are divisible by a bigger power of p than n p^r,
which gives a contradiction.

We proceed now to consider infinite groups. Of course, the purely negative
characteristic of not being finite does not reflect the situations which really arise.
Usually the infinite set of elements of a group is defined by some constructive
process or formula. This formula may contain some parameters, which may take
integer values, or be real numbers, or even points of a manifold. This is the
starting point of an informal classification: groups are called discrete in the first
case, and continuous in the second. The simplest example of a discrete group is
the infinite cyclic group, whose elements are of the form g^n where n runs through
all the integers.

In any case, discrete groups often arise as transformation groups which are 
discrete, now in a more precise sense of the word. Thus, the group of integers is
isomorphic to the group of those translations of the line which preserve the
function sin 2πx, consisting of translations x ↦ x + d by an integer d. First of all,
let’s consider this situation. 

Let X be a topological space; in all the examples, X will be assumed to be
locally compact and Hausdorff, and most frequently will be a manifold, either
differentiable or complex analytic. A group G of automorphisms of X is discrete
(or discontinuous) if for any compact subset K ⊂ X there exists only a finite set of
transformations g ∈ G for which K ∩ gK is nonempty. We can introduce a
topology on the set of orbits G\X, by taking open sets to be the subsets whose
inverse images under the canonical map f: X → G\X are open. If the stabiliser
of every point x ∈ X is just e, then we say that G acts freely. In this case any point
ξ ∈ G\X has a neighbourhood whose inverse image under the canonical map
f: X → G\X breaks up as a disjoint union of open sets, each of which is mapped
homeomorphically by f. In other words, X is an unramified cover of the space
G\X. In particular, if X was a differentiable or analytic manifold, then G\X will
again be a manifold of the same type.

If some group 𝔊 is simultaneously a manifold (cases of this will be considered
in the following § 15) then a subgroup G ⊂ 𝔊 is said to be discrete if it is discrete
under the left regular action on 𝔊.

The construction of quotient spaces G\X is an important method of
constructing new topological spaces. An intuitive way of representing them is related to
the notion of a fundamental domain. By this we mean a set 𝒟 ⊂ X such that the
orbit of every point x ∈ X meets 𝒟, and that of an interior point x of 𝒟 only
meets 𝒟 in x itself. Then different points of one orbit belonging to the closure
of 𝒟 can only lie on the boundary of 𝒟, and we can visualise the space G\X
as obtained from the closure of 𝒟 by glueing, identifying points on the boundary
that belong to one orbit. For example, the above group of translations of the line,
consisting of transformations x ↦ x + d, has the interval [0, 1] as a fundamental
domain. Identifying the boundary points, we get a circle. The space G\X is
compact if and only if G has a fundamental domain with compact closure.


Example 1. Discrete subgroups of the group of vectors of an n-dimensional 
real vector space R”. 


Theorem I. Any discrete subgroup of the group R^n is isomorphic to Z^m with m ≤ n,
and consists of all linear combinations with integer coefficients of m linearly
independent vectors e_1, ..., e_m.

A group of this form is called a lattice. A fundamental domain of a lattice can
be constructed by completing the system of vectors e_1, ..., e_m to a basis e_1, ..., e_n
and then setting


𝒟 = {α_1 e_1 + ··· + α_n e_n | 0 ≤ α_1, ..., α_m ≤ 1}.


The space G\R^n is compact if and only if m = n. In this case the fundamental
domain 𝒟 is the parallelepiped constructed on the vectors e_1, ..., e_n.
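
In the case m = n = 2 the passage from a point to its representative in the fundamental parallelogram can be written out explicitly: expand the point in the basis e_1, e_2 and keep only the fractional parts of the coefficients. The Python sketch below assumes integer coordinates for the basis and the point, so that exact rational arithmetic can be used; the function name and the sample lattice are choices made for this illustration. It uses the half-open parallelogram 0 ≤ α_i < 1, in which every orbit has exactly one representative.

```python
from fractions import Fraction
from math import floor

def reduce_to_fundamental_domain(x, e1, e2):
    """Representative of x modulo the lattice Z*e1 + Z*e2 with coefficients in [0, 1)."""
    d = e1[0] * e2[1] - e1[1] * e2[0]            # determinant of the basis, assumed nonzero
    a1 = Fraction(x[0] * e2[1] - x[1] * e2[0], d)   # x = a1*e1 + a2*e2 by Cramer's rule
    a2 = Fraction(e1[0] * x[1] - e1[1] * x[0], d)
    f1, f2 = a1 - floor(a1), a2 - floor(a2)         # fractional parts of the coefficients
    return (f1 * e1[0] + f2 * e2[0], f1 * e1[1] + f2 * e2[1])

e1, e2 = (2, 0), (1, 3)
point = reduce_to_fundamental_domain((7, 5), e1, e2)
lattice_point = reduce_to_fundamental_domain((4, 6), e1, e2)
print(point, lattice_point)
```

With this basis the point (7, 5) is carried to (2, 2), and the difference (5, 3) = 2e_1 + e_2 is a vector of the lattice; a lattice vector such as (4, 6) = e_1 + 2e_2 is carried to the origin.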

In this paragraph, we assume that the reader has met the notion of a Riemann
surface and its genus. If n = 2 then the plane R² can be viewed as the plane of
one complex variable; as such it is denoted by C. If G is a lattice in C then the
quotient space G\C inherits from C the structure of a 1-dimensional complex
manifold, that is, it is a compact Riemann surface. Its genus is equal to 1, and it
can be shown that all compact Riemann surfaces of genus 1 can be obtained in
this way. Meromorphic functions on the Riemann surface G\C are meromorphic
functions f(z) of one complex variable, invariant under translation z ↦ z + α with
α ∈ G, that is, elliptic functions with elements of G as periods.


Theorem. Two Riemann surfaces G_1\C and G_2\C constructed as above are
conformally equivalent if and only if the lattices G_1 and G_2 are similar.


Example 2. Crystallographic Groups. This is a direct generalisation of Example 
1 (or more precisely, of the case m = n). The atoms of a crystal are arranged in 
space in a discrete and extremely symmetrical way. This is seen in the fact that 
their relative position in space repeats itself indefinitely. More precisely, there 
exists a bounded domain 𝒟 such that any point of space can be taken into a
point of 𝒟 by a symmetry of the crystal, that is, by a motion of space that
preserves the physical properties of the crystal (taking every atom into an atom
of the same element, and preserving all relations between atoms). In other words,
the symmetry group G of the crystal is a discrete group of motions of 3-space R³
and the space G\R³ is compact. In this connection, a crystallographic group is a
discrete group G of motions of n-dimensional Euclidean space R^n for which the
quotient space G\R^n is compact.

The main result of the theory of crystallographic groups is the following: 


II. Bieberbach’s Theorem. The translations contained in a crystallographic
group G form a normal subgroup A such that A\R^n is compact, and the index (G : A)
is finite.

In the case n = 3, this means that for every crystal there exists a parallelepiped
Π (a fundamental domain of the subgroup A ⊂ G, where G is the symmetry group
of the crystal) such that all properties of the crystal in Π and in its translates gΠ
(for g ∈ A) are identical, and these translates fill the whole of space. Π is called
the repeating parallelepiped of the crystal.

In the general case, by Theorem I, A consists of translations in vectors of some
lattice C ≅ Z^n. The finite group F = G/A is a symmetry group of C. From this,
using Jordan’s Theorem (§ 13, Theorem XII), we can deduce:

Theorem III. The number of crystallographic groups in a space of given dimension
n is finite.


In this, two groups G_1 and G_2 are considered to be the same if one can be taken
into the other by an affine transformation of R^n. It can be shown that this
property is equivalent to G_1 and G_2 being isomorphic as abstract groups.

The crystallographic groups, arising in connection with crystallography, also 
have a very natural group-theoretical characterisation: they are exactly the 
groups G which contain a normal subgroup of finite index isomorphic to Z^n, and
not contained in any bigger Abelian subgroup. 

For crystallography it is extremely important to have a list of all the types of 
crystallographic groups in 3-dimensional space. Indeed, for any crystal, if we can 
indicate its group, its fundamental domain and the position of its atoms inside 
this, then we have determined the whole crystal, however far it grows. This gives 
a method of representing crystals in finite terms, which is actually used in 
compiling crystallographic tables. The list of all crystallographic groups is too 
long for us to give here, but some idea of it can be gained from the 2-dimensional 
case. 


Theorem IV. There are 17 different crystallographic groups in the plane. 


Each of these has a normal subgroup A ⊂ G which consists of translations in
the vectors of some lattice C. The transformations g ∈ G of course take this lattice
into itself, with elements g ∈ A acting just by translations. Hence the finite group
F = G/A is a symmetry group of this lattice, and belongs to one of the thirteen types
described in § 13, Theorem VIII. It can happen however that two distinct groups
G have the same lattice C and determine the same symmetry group of C. To give
an example, consider the rectangular lattice Γ_rect and the symmetry group D_1 of
Γ_rect consisting of the identity and the reflection in the OB axis (Figure 27).

Fig. 27


We can consider the group G generated by the group T of translations in vectors
of C and the above group D_1 of orthogonal transformations. The group G can be
characterised as the symmetry group of the pattern illustrated in Figure 27. On
the other hand, consider the motion s consisting of translation in the vector DB
together with a reflection in the axis OB, and the translation t in the vector OA.
Write G_1 for the group generated by s and t. Then since s² is the translation in
OB, the group T and the lattice C will be the same in the two cases, and the
groups of symmetries of C generated by the motions of G and G_1 will also
coincide. However, the groups G and G_1 are not isomorphic: G contains
reflections, but G_1, as one checks easily, only contains translations and motions in
which a reflection is combined with a translation along one of the vertical lines
of the lattice. In particular, the group G contains an element of order 2, and G_1
does not. The group G_1 coincides with the symmetry group of the pattern
illustrated in Figure 16.
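
The difference between G and G_1 can be checked by a short computation. The Python sketch below rests on assumptions made only for the illustration: coordinates are chosen so that OA = (1, 0) and OB = (0, 1), and the translation part of s is taken to be half of OB, which is what the relation between s² and the translation in OB requires. A motion (x, y) ↦ (εx + a, y + b) with ε = ±1 is stored as the triple (ε, a, b); all names are hypothetical.

```python
from fractions import Fraction

IDENTITY = (1, 0, 0)

def compose(g, h):
    """The motion 'first h, then g'; (eps, a, b) encodes (x, y) -> (eps*x + a, y + b)."""
    eg, ag, bg = g
    eh, ah, bh = h
    return (eg * eh, eg * ah + ag, bg + bh)

def inverse(g):
    eps, a, b = g
    return (eps, -eps * a, -b)

def power(g, k):
    if k < 0:
        g, k = inverse(g), -k
    result = IDENTITY
    for _ in range(k):
        result = compose(result, g)
    return result

T = (1, 1, 0)                 # translation t in the vector OA = (1, 0)
S = (-1, 0, Fraction(1, 2))   # s: reflection in the axis OB combined with a shift by OB/2
R = (-1, 0, 0)                # the reflection in the axis OB, an element of G

s_squared = compose(S, S)

# Since s t s^(-1) = t^(-1), every element of G_1 has the form t^a s^b;
# we inspect a finite window of such elements.
window = {compose(power(T, a), power(S, b))
          for a in range(-4, 5) for b in range(-4, 5)}
order_two_in_window = [g for g in window
                       if g != IDENTITY and compose(g, g) == IDENTITY]
print(s_squared, order_two_in_window, compose(R, R) == IDENTITY)
```

The square of s is the translation in OB, the window of elements of G_1 contains no element of order 2, and the reflection R in G has order 2. The window is finite, so this is a check and not a proof; the general statement follows because the square of t^a s^b shifts the y-coordinate by b and, when b = 0, the x-coordinate by 2a.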

As in this example, from the 13 groups of symmetries listed in § 13, Theorem
VIII we can form 13 2-dimensional crystallographic groups generated by
translations in vectors of the corresponding lattice and orthogonal transformations
which are symmetries of it (acting as in the construction of the group G in the
above example). In this case the stabiliser G_x for x any point of the lattice will
be isomorphic to the symmetry group of one of the 13 types which we chose.
But in some cases a more delicate construction is possible (as the construction
of G_1 in the example). Then the stabiliser will be smaller than the symmetry
group, since some symmetries will only occur in G in combination with
translations (like the transformation s in the example). Thus we can construct a new
group for the symmetry group D_1(Γ_rect) (the group G_1 we have constructed
above), two groups for D_2(Γ_rect) and one for D_4(Γ_sq). This gives 17 groups.

We conclude with the example of the ‘new’ group corresponding to the
symmetry group D_4(Γ_sq). We include in it the group C_4 of rotations of the plane about
the point O through angles of 0, π/2, π, 3π/2 and the translation along an axis l
through O combined with a reflection in this axis. The group G is generated by
these transformations. If σ is the rotation through π/2 then s′ = σsσ^{-1} is a
transformation similar to s, but having axis l′ orthogonal to l. The subgroup of
translations is generated by the translations s² and (s′)² along the axes l and l′. The
group G we have constructed is the symmetry group of the pattern of Figure 28.

The groups we have constructed are also called the ‘ornament groups’ (or 
‘wallpaper pattern groups’ in English textbooks), since they can be interpreted 
as groups of symmetries of patterns in the plane. A complete list of ornaments 
corresponding to each of the 17 groups is contained, for example, in the book 
[Mal’tsev 80 (1956) ]. Figures 16, 27, 28 are examples of such patterns, especially 
thought up to characterise some of the groups. However, the ornaments created 
from purely aesthetic considerations are of course much more interesting. An 
example is the ornament of Figure 29, taken from [Speiser 98 (1937)]. It is 
interesting in that it is taken from a tomb in Thebes and was created by ancient 
Egyptian artisans. This shows that a deep understanding of the idea of symmetry, 
axiomatised in the notion of a group, developed very long ago. 

The situation in the 3-dimensional case is much more complicated.

Theorem V. The number of different crystallographic groups in 3-space is equal
to 219. As in Theorem III, we consider groups to be the same if they are isomorphic,
or (what is the same) if they can be taken into one another by an affine
transformation of space.


All 219 groups are realised as the symmetry groups of genuine crystals. 

In crystallography, the number of different crystallographic groups is often 
given as 230. This comes from the fact that there, groups are only considered to 
be the same if they are taken into one another by a transformation which 
preserves the orientation of space; in the plane, the two notions of equivalence 
lead to the same classification. 

The theory of crystallographic groups explains the role of finite symmetry 
groups of lattices which we considered in § 13, Example 7. Symmetries of a crystal 
are given by the whole crystallographic group G, but because of the fact that the 
distances between atoms are very small, the group which is more noticeable from 
the macroscopic point of view is not the translation group A but the quotient 
group G/A, a symmetry group of A. It is interesting to note that in the list of 
groups of § 13, Theorem VII, we only meet groups containing rotations through 
π/2, π/3 or multiples of these. Hence only these rotations can occur as symmetries 
of crystals. It is all the more astonishing that in real life we often meet other 
symmetries. For example, everyone knows the flowers of the geranium and the 
bluebell (campanula) whose petals have symmetry of order 5. In Figure 30, taken 
from the book [The life of plants 1 (1981)], we can see the 5-fold symmetry of 
the flowers of campanula (bluebell) (a) and Stapelia variegata (variegated carrion- 
flower) (b), and the 7-fold symmetry of the position of the leaves of the baobab 
tree (c). 


Fig. 30a—c 


Example 3. Non-Euclidean Crystallography. Discrete groups of motions are of 
interest not just for Euclidean spaces, but also for other spaces. Here we discuss 
the case of the Lobachevsky plane A; we will only consider discrete groups of 
motions G of the plane A preserving the orientation, satisfying the two conditions: 
(1) A motion g ∈ G with g ≠ e does not fix any point of A; and (2) the space 
G\A is compact. In the case of the Euclidean plane the only groups satisfying 
these conditions are the groups of translations in vectors of a lattice. 

The interest in groups G of this type arose in connection with the fact that 
under condition (1) the space G\A is a manifold, in the present case a surface. If 
we use Poincaré’s interpretation of the Lobachevsky plane in the upper half-plane 
ℂ⁺ of the plane of one complex variable, then the surface G\A inherits the 
complex structure of the upper half-plane, and (assuming condition (2)) is a 
compact Riemann surface. Meromorphic functions on the Riemann surface 
G\ℂ⁺ are meromorphic functions on ℂ⁺ invariant under G. They are called 
automorphic functions. This can be compared with the situation in Example 1, 
where we considered the space G\ℂ. In that case we obtained compact Riemann 
surfaces of genus 1. It is proved that in the case we are now considering we obtain 
precisely all compact Riemann surfaces of genus > 1 (the Poincaré-Koebe 
uniformisation theorem). Thus both of these cases together give a group-theoretical 
description of all compact Riemann surfaces (the remaining case of genus 0 is 
the Riemann sphere). 

As fundamental domain of the group G of the type under consideration we 
can take a 4p-gon in the Lobachevsky plane with alternate pairs of sides equal: 
that is, a_1, b_1, a_1′, b_1′, ..., a_p, b_p, a_p′, b_p′, where a_i and a_i′, b_i and b_i′ are equal intervals. 
The transformations taking the side a_i into a_i′ or b_i into b_i′ (the directions of the 
sides which are identified under these are indicated in Figure 31) are the generators 
of G. 


Fig. 31 


The only conditions which the polygon must satisfy are of course that the sides a_i, 
a_i′ and b_i, b_i′ are equal, and that the sum of its angles is 2π (this relates to the 
fact that in G\A all the vertexes are glued together as one point). 


Example 3a. An important particular case of Example 3 arises if we consider 
the Cayley-Klein (rather than the Poincaré) interpretation of Lobachevsky space. 
Let f(x, y, z) be an indefinite quadratic form with integer coefficients. Consider 
the group G ⊂ SL(3, Z) consisting of integral transformations preserving f. Interpreting 
x, y, z as homogeneous coordinates on the projective plane, we realise G 
as a group of projective transformations of the set f > 0, that is, a group of 
motions of the Lobachevsky plane A in the Cayley-Klein interpretation. It can 
be shown that G\A is compact if and only if the equation f(x, y, z) = 0 does not 
have any rational solutions other than (0, 0, 0). (A criterion for this equation to 
have solutions is given by Legendre’s theorem, § 7, Theorem III.) In this case 
condition (1) of Example 3 may not be satisfied; this will be the case if G contains 
an element of finite order other than e. But then there exists a subgroup G′ ⊂ G 
of finite index satisfying (1): applying the argument given at the end of § 13, 
Example 9, we can show that we can take the subgroup consisting of matrixes 
g ∈ G with g ≡ E mod p (for any choice of prime p ≠ 2). 


Example 4. The group SL(2, Z) consisting of 2 × 2 integral matrixes with 
determinant 1. The significance of this group is related to the fact that in a 
2-dimensional lattice two bases e_1, e_2 and f_1, f_2 are related by 

\[ f_1 = a e_1 + c e_2, \qquad f_2 = b e_1 + d e_2 \]

with 

\[ a, b, c, d \in \mathbb{Z} \quad\text{and}\quad ad - bc = \pm 1, \quad\text{that is}\quad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2, \mathbb{Z}). \]

If we also require that the direction of rotation from f_1 to f_2 is the same as that 
from e_1 to e_2, then ad − bc = 1, that is 

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}). \]

A problem which crops up frequently is the classification of lattices in the 
Euclidean plane up to similarity. In Example 1 we saw that, for example, the 
classification of compact Riemann surfaces of genus 1 reduces to this. As there, 
we realise our plane as the complex plane ℂ: then similarities are given by 
multiplication by nonzero complex numbers. Let z_1, z_2 be a basis of a lattice 
C ⊂ ℂ. We will suppose that the angle between z_1 and z_2 is < π, and choose the 
order of the vectors so that the rotation from z_1 to z_2 is anticlockwise. Applying 
a similarity, which can be expressed as multiplication by z_1^{-1}, we get a similar 
lattice C′ with basis 1, z where z = z_1^{-1} z_2, and, in view of the assumptions we 
have made, z lies in the upper half-plane ℂ⁺. Then two bases 1, z and 1, w define 
similar lattices if the basis (bz + d, az + c) for some a, b, c, d ∈ Z with ad − bc = 1 
can be taken into (1, w) by a similarity. This similarity must be given by 
multiplication by (bz + d)^{-1}, and hence 

\[ w = \frac{az + c}{bz + d}. \]

Thus we define the action of the group SL(2, Z) on the upper half-plane ℂ⁺ as 
follows: 

\[ \text{for } g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}), \qquad gz = \frac{az + c}{bz + d}. \]

Here the matrix $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ acts as the identity, so that we have an action of 
the group SL(2, Z)/N, where 

\[ N = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \right\}. \]

This quotient group is denoted by PSL(2, Z); it is called the modular group. We 
see that the set of lattices up to similarity can be represented as G\ℂ⁺, where G 
is the modular group. The modular group acts discretely on the upper half-plane. 
A fundamental domain for it is given by the shaded region of Figure 32. 

Fig. 32 


This region 𝒟 is called the modular figure. It is not bounded, but it has another 
important property. As is well known, the upper half-plane is a model of the 
Lobachevsky plane, and motions in it which preserve the orientation are given 
as transformations 

\[ z \mapsto \frac{\alpha z + \beta}{\gamma z + \delta}, \quad\text{where } \alpha, \beta, \gamma, \delta \in \mathbb{R} \text{ with } \alpha\delta - \beta\gamma = 1. \]

Thus the modular group is a discrete group of motions of the Lobachevsky plane. 
Now in the sense of Lobachevsky geometry, the modular figure is of finite area. In 
view of this the surface G\ℂ⁺ is not compact, but in the natural metric it has 
finite area. 

The modular group is analogous to the groups considered in Example 3, but 
is not one of them: to start with, some of its transformations have fixed points 
(for example z ↦ −1/z), and secondly, G\A is not compact. The analogy with 
Example 3 will be clearer if we think of the modular figure from the point of view 
of Lobachevsky geometry. It is a triangle with one vertex at infinity, and the sides 
converging to this vertex become infinitely close to one another. This is more 
visible for the equivalent region 𝒟′ of Figure 32. 
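
The action of the modular group on the upper half-plane lends itself to a small numerical experiment. The following Python sketch is an illustration under stated assumptions, not part of the original argument: it takes the shaded region of Figure 32 to be the standard modular figure |z| ≥ 1, |Re z| ≤ 1/2 (the figure is not reproduced here), and it uses the fact that the transformations z ↦ z + 1 and z ↦ −1/z both belong to the modular group. The function name `reduce_to_modular_figure` is hypothetical.

```python
def reduce_to_modular_figure(z, max_steps=1000):
    """Move z (with Im z > 0) into the region |z| >= 1, |Re z| <= 1/2.

    Only the modular transformations z -> z + k (k an integer) and
    z -> -1/z are used, so the result is equivalent to z under the group.
    The shape of the region is an assumption about Figure 32.
    """
    z = complex(z)
    if z.imag <= 0:
        raise ValueError("z must lie in the upper half-plane")
    for _ in range(max_steps):
        # Translate by an integer so that |Re z| <= 1/2.
        z = complex(z.real - round(z.real), z.imag)
        if abs(z) >= 1:
            return z
        # For |z| < 1 the inversion z -> -1/z increases the imaginary part.
        z = -1 / z
    raise RuntimeError("no reduction found within max_steps")


if __name__ == "__main__":
    print(reduce_to_modular_figure(complex(2.3, 0.7)))
```

Two bases 1, z and 1, w then define similar lattices when the two reduced points coincide (up to the identifications on the boundary of the region, which this sketch does not normalise).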


Example 5. Let G = GL(n, Z). This is a discrete subgroup of GL(n, R), and acts 
on the same spaces as GL(n, R). Of particular importance is its action on the set 
ℋ_n of real positive definite matrixes A defined up to a positive multiple: g(A) = 
gAg* (see § 12, Example 4 for the case n = 2). This action expresses the notion 
of integral equivalence of quadratic forms. A fundamental domain here is also 
noncompact, but has bounded volume (in the sense of the measure invariant 
under the action of GL(n, R)). 

The group GL(n, Z) belongs to the important class of arithmetic groups, which 
we will discuss in the next section. 


Example 6. Free Groups. Consider a set of symbols s_1, ..., s_n (for simplicity 
we will think of this as a finite set, although the arguments do not depend on 
this). To each symbol s_i we assign another symbol s_i^{-1}. A word is a sequence of 
symbols s_i and s_i^{-1} in any order (written down next to one another), for example 
s_1 s_2 s_2 s_1^{-1} s_3. The empty word e is also allowed, in which no symbol appears. A 
word is reduced if it does not at any point contain the symbols s_i and s_i^{-1} or s_i^{-1} 
and s_i adjacent. The inverse of a word is the word in which the symbols are written 
out in the opposite order, with s_i replaced by s_i^{-1} and s_i^{-1} by s_i. The product of 
two words A and B is the word obtained by writing B after A, and then cancelling 
out all adjacent pairs of s_i and s_i^{-1} until we get a reduced word (possibly the 
empty word). The set of reduced words with this operation of multiplication 
forms a group, as is checked without difficulty. This group is the free group on 
n generators, and is denoted by 𝒮_n. Obviously, the words s_1, ..., s_n consisting 
of just one symbol are generators, s_i^{-1} is the inverse of s_i, and any word can be 
thought of as a product of the s_i and s_i^{-1}. 
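
The cancellation procedure just described is easy to mechanise. The following Python sketch is only an illustration: the encoding of a symbol as a pair `(i, 1)` for the i-th generator and `(i, -1)` for its inverse, and the function names, are choices made here and do not come from the text.

```python
def inverse_symbol(symbol):
    """Formal inverse of a symbol: (i, 1) <-> (i, -1)."""
    index, sign = symbol
    return (index, -sign)


def reduce_word(word):
    """Cancel adjacent pairs of a symbol and its inverse until none remain."""
    stack = []
    for symbol in word:
        if stack and stack[-1] == inverse_symbol(symbol):
            stack.pop()           # the pair cancels
        else:
            stack.append(symbol)
    return tuple(stack)


def multiply(a, b):
    """Product of two words: write b after a, then reduce."""
    return reduce_word(tuple(a) + tuple(b))


def inverse(word):
    """Inverse word: reverse the order and invert every symbol."""
    return tuple(inverse_symbol(s) for s in reversed(word))


if __name__ == "__main__":
    w = ((1, 1), (2, 1), (2, 1), (1, -1), (3, 1))
    print(multiply(w, inverse(w)))  # the empty word
```

Because each symbol is pushed and popped at most once, a word of length m is reduced in about m steps; this is one way to see that the identity problem is trivial for free groups.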

The free group 𝒮_2 on generators x and y can be realised as a group of 
transformations of a 1-dimensional complex, that is, of a topological space 
consisting of points and segments joining them. For this, for the points we take 
all the different elements of 𝒮_2, and we join two points corresponding to reduced 
words A and B if B can be obtained from A by multiplying on the right with x, 
y, x^{-1} or y^{-1} (see Figure 33). 


Fig. 33 

Obviously, if two words A and B are represented by points joined by a segment, 
then the same is true for CA and CB for any C ∈ 𝒮_2. Hence the left regular action 
(see § 12) of 𝒮_2 defines an action of 𝒮_2 on this complex. If we introduce a 
‘biorientation’ on the complex, marking each segment with one of two types of 
arrow (right and upwards, as in Figure 33), it is easy to see that the group 𝒮_2 
will be the full group of automorphisms of this bioriented complex. 


Consider any group G having n generators g_1, ..., g_n. It is easy to see that the 
correspondence which takes a reduced word in s_1, ..., s_n into the same expression 
in g_1, ..., g_n is a homomorphism of 𝒮_n onto G. Hence every group is a homomorphic 
image of a free group, so that free groups play the same role in group theory 
as free modules in the theory of modules and the noncommutative polynomial 
ring in the theory of algebras (see § 5 and § 8). 

Let 

\[ G = \mathcal{S}_n / N \]

be a presentation of a group G as a quotient group of a free group by a normal 
subgroup N. Elements r_1, ..., r_m which, together with their conjugates, generate 
N are called defining relations for G. Obviously, the relations 

\[ r_1 = e, \ \ldots, \ r_m = e \]

hold in G (where the r_i are viewed as words in the generators g_1, ..., g_n of G). 
Specifying defining relations uniquely determines the normal subgroup N, and 
hence G. This gives a precise meaning to the statement that a group is defined 
by relations, which we have already used. A group having a finite number of 
generators is finitely generated, and if it can also be presented by a finite number 
of relations, then it is finitely presented. For example, § 13, (1), (2) and (3) are 
defining relations of the symmetric group 𝔖_n, and § 13, (7) and (8) are those of a 
finite group generated by reflections. It can be shown that the group PSL(2, Z) 
of Example 4 is generated by the matrixes 

\[ s = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \quad\text{and}\quad t = \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}, \]

and that the defining relations of the group are of the form 

\[ s^2 = e, \qquad t^3 = e. \]
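
The two relations can be checked by direct matrix multiplication. The short Python sketch below is an illustration under an explicit assumption: it takes s and t to be the integer matrixes with rows (0, 1), (−1, 0) and (0, −1), (1, 1) respectively (the second rows are a reconstruction, since only the first rows are legible in the source). It verifies that s² and t³ equal −E, which is the identity element of PSL(2, Z). It does not show that these are the only relations.

```python
E = ((1, 0), (0, 1))
MINUS_E = ((-1, 0), (0, -1))

# Assumed generators; the second rows are a reconstruction.
s = ((0, 1), (-1, 0))
t = ((0, -1), (1, 1))


def matmul(p, q):
    """Product of two 2 x 2 integer matrixes given as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def is_identity_in_psl(m):
    """In PSL(2, Z) the matrixes E and -E represent the same element e."""
    return m in (E, MINUS_E)


s_squared = matmul(s, s)
t_cubed = matmul(matmul(t, t), t)

if __name__ == "__main__":
    print(s_squared, t_cubed)
```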


Can we accept a presentation of a group in terms of generators and relations 
as an adequate description (even if the number of generators and relations 
is finite)? If g_1, ..., g_n are generators of a group G, then to have some idea 
of the group itself, we must know when different expressions of the form 
$g_{i_1}^{\alpha_1} g_{i_2}^{\alpha_2} \cdots g_{i_m}^{\alpha_m}$ determine (in terms of the defining relations) the same element of 
the group. This question is called the word problem or the identity problem. It is 
trivial for free groups, and has been solved for certain very special classes of 
groups, for example for groups given by one relation, but in the general case it 
turns out to be impossibly hard. The same can be said of another problem of 
this type, that of knowing whether two groups given by generators and relations 
are isomorphic (this is called the isomorphism problem). 

Both of these problems were raised to a new level when mathematical logicians 
created a precise definition of an algorithm. Up to this point, one could 
only solve the identity problem and put forward a procedure, called an algorithm, 
for establishing the identity of two expressions given in terms of generators. Now 
however, it turns out that there is a well-posed problem: are the identity and 
isomorphism problems solvable? 

This was quickly settled. It turns out that, among groups given by generators 
and relations, there exist some in which the identity problem is not solvable, and 
groups for which the isomorphism problem is not solvable even with the identity 
group. 

Perhaps the most striking example of the need as a matter of principle to apply 
notions of mathematical logic to the study of purely group-theoretical problems 
is the following result. 


Higman’s Theorem. A group with a finite number of generators and infinite 
number of defining relations is isomorphic to a subgroup of a group defined by a 
finite number of relations if and only if the set of its relations is recursively 
enumerable. (The latter term, also relating to mathematical logic, formalises the 
intuitive notion of an inductive method of getting hold of all elements of some set 
by constructing them one by one.) 


Presentations of groups by generators and relations occur most frequently in 
topology. 


Example 7. The Fundamental Group. Let X be a topological space. Its fundamental 
group consists of closed paths, considered up to continuous deformation. 
A path with starting point x ∈ X and end point y ∈ X is a continuous map 
f: I → X of the interval I = [0 ≤ t ≤ 1] into X for which f(0) = x and f(1) = y. 
A path is closed if x = y. The composite of two paths f: I → X with starting point 
x and end point y and g: I → X with starting point y and end point z is the map 
fg: I → X given by 

\[ (fg)(t) = f(2t) \quad\text{for } 0 \le t \le 1/2, \]
\[ (fg)(t) = g(2t - 1) \quad\text{for } 1/2 \le t \le 1. \]

Two paths f: I → X and g: I → X with the same starting point x and end point 
y are homotopic if there exists a continuous map φ: J → X of the square J = 
{0 ≤ t, u ≤ 1} such that 

\[ \varphi(t, 0) = f(t), \quad \varphi(t, 1) = g(t), \quad \varphi(0, u) = x, \quad \varphi(1, u) = y. \]

Closed paths starting and ending at x_0, considered up to homotopy, form a group 
under the multiplication defined as the composite of paths. This group is called 
the fundamental group of X; it is denoted by π(X) (and also by π_1(X), in view of 
the fact that groups π_n(X) for n = 1, 2, 3, ..., also exist, and will be defined in 
§ 20). In the general case, the fundamental group depends on the choice of the 
point x_0 and is denoted by π(X, x_0), but if any two points of X can be joined by 
a path (we will always assume this in what follows), then the groups π(X, x_0) for 
x_0 ∈ X are all isomorphic. A space X is simply connected if π(X) = {e}. 
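
The composite of two paths can be written down directly from its definition. The Python sketch below is an illustration only: a path is modelled as a function on [0, 1] with values in the plane, and the names `compose`, `f` and `g` are chosen here for the example.

```python
def compose(f, g):
    """Composite path fg: run through f on [0, 1/2], then through g on [1/2, 1].

    It is assumed that the end point f(1) equals the starting point g(0).
    """
    def fg(t):
        if not 0 <= t <= 1:
            raise ValueError("the parameter t must lie in [0, 1]")
        return f(2 * t) if t <= 0.5 else g(2 * t - 1)
    return fg


def f(t):
    """Path in the plane from (0, 0) to (1, 0)."""
    return (t, 0)


def g(t):
    """Path in the plane from (1, 0) to (1, 1)."""
    return (1, t)


if __name__ == "__main__":
    fg = compose(f, g)
    print(fg(0), fg(0.5), fg(1))
```

At t = 1/2 the two formulas agree precisely because f(1) = g(0), which is what makes the composite continuous.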

If X is a cell complex (that is, a union of disjoint ‘cells’, the images of balls of 
different dimensions) and has a single 0-dimensional cell, then the fundamental 
group π(X) has as generators the paths corresponding to 1-dimensional cells, 
and defining relations corresponding to 2-dimensional cells. For example, a 
1-dimensional complex does not have any 2-dimensional cells, and its fundamen- 
tal group is therefore free. The fundamental group of a ‘bouquet’ of n circles (see 
Figure 34 for the case n = 4) is a free group with n generators. 


Fig. 34 


An oriented compact surface homeomorphic to a sphere with p handles (see 
Figure 35 for the case p = 2) can be obtained by glueing a 4p-gon along alternate 
pairs of sides, as illustrated (for p = 2) in Figure 31. It is therefore a cell complex 
with a single 0-dimensional, 2p 1-dimensional and one 2-dimensional cell. 


Fig. 35 


Hence its fundamental group has 2p generators: s_1, t_1, s_2, t_2, ..., s_p, t_p, where s_i is 
obtained from the paths a_i and a_i′, and t_i from b_i and b_i′. There is one relation 
between them, which corresponds to going round the perimeter of the 4p-gon 
keeping track of the direction of the sides: 

\[ s_1 t_1 s_1^{-1} t_1^{-1} \cdots s_p t_p s_p^{-1} t_p^{-1} = e. \tag{1} \]


The undecidability of the isomorphism problem for groups allows us to prove 
(by constructing manifolds with these fundamental groups) that the 
homeomorphism problem for manifolds of dimension ≥ 4 is undecidable. 

The fundamental group is closely related to the discrete transformation 
groups we considered earlier. In fact, if X is a space in which any two points can 
be joined by a path, then there exists a connected and simply connected space 
X̃ and a group G isomorphic to π(X) acting on it, in such a way that X = G\X̃. 
The space X̃ is called the universal cover of X. Conversely, if X̃ is connected and 
simply connected, and a group G acts discretely and freely on X̃, then X̃ is the 
universal cover of X = G\X̃, and G is isomorphic to π(X). Thus in Example 3, 
a Riemann surface X is represented as G\ℂ⁺; hence the upper half-plane ℂ⁺ is the 
universal cover of X and G ≅ π(X). From this we get that G is defined by the 
relation (1). The complex X̃ illustrated in Figure 33 is obviously simply connected 
and the free group 𝒮_2 acts freely on it. It is easy to see that a fundamental domain 
for this action is formed by two segments ex and ey. Hence 𝒮_2\X̃ is a bouquet of two 
circles (as in Figure 34, but with n = 2), obtained by identifying e with x and with 
y, and X̃ is the universal covering of this bouquet. 


Example 8. The Group of a Knot. A knot is a smooth closed curve Γ in 
3-dimensional space which does not intersect itself. The problem is that of 
classifying knots up to isotopy, that is, up to a continuous deformation of space. 
The main invariant for this is the knot group of Γ, that is, the fundamental group 
of the complement, π(R³ \ Γ). To get a pictorial representation of the knot, one 
projects it into the plane, indicating in the resulting diagram which curve goes 
over and which goes under at each crossover point (Figure 36). 


Fig. 36 


The generators of the knot group correspond to the segments into which the 
points of intersection divide the resulting curve (for example, the path γ of Figure 
36 corresponds to the interval ABC). It can be shown that the defining relations 
correspond to the crossover points. The simplest knot, an unknotted circle, can 
be projected without self-intersection, so that its knot group is the infinite cyclic 
group. The role of the knot group is illustrated for example by the following 
result. 


Theorem. A knot is isotopic to the unknotted circle if and only if its knot group 
is isomorphic to the infinite cyclic group. 


Here we run into an example where a substantial topological question leads 
to a particular case of the isomorphism problem. 


Example 9. Braid Groups. We consider a square ABCD in 3-space, and put on 
each of the sides AB and CD a collection of n points: P_1, ..., P_n and Q_1, ..., Q_n. 
A braid is a collection of n smooth disjoint curves contained in a cube constructed 
on ABCD with starting points P_1, ..., P_n and end points Q_1, ..., Q_n (but possibly 
in a different order; see Figure 37, (a)). 


Fig. 37 


A braid is considered up to isotopy. Multiplication of braids is illustrated in 
Figure 37, (b). Identifying the points P_i and Q_i we obtain closed braids. The classes 
of closed braids up to isotopy form the braid group Σ_n. Generators of Σ_n are 
the braids σ_i for i = 1, ..., n − 1 in which only two threads are interchanged 
(Figure 37, (c)). The defining relations are of the form 

\[ \begin{aligned} \sigma_i \sigma_j &= \sigma_j \sigma_i \quad\text{for } j \ne i \pm 1, \\ \sigma_i \sigma_{i+1} \sigma_i &= \sigma_{i+1} \sigma_i \sigma_{i+1}. \end{aligned} \tag{2} \]

The question of when two braids are isotopic can be restated as the identity 
problem in the braid group, defined by the relations (2). In this particular case 
the identity problem is solvable and in fact solved—this is one application of 
group theory to topology. 
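
A simple consistency check on the relations (2) is available. Since each thread of a braid joins some starting point to some end point, a braid determines a permutation of n objects, and the braid interchanging the threads numbered i and i + 1 goes to the transposition of i and i + 1. The Python sketch below is an illustration only (the function names are chosen here): it verifies that these transpositions satisfy the relations (2). It is not a solution of the identity problem, because the transpositions also satisfy the extra relation that their squares are the identity, which does not hold for braids.

```python
def transposition(i, n):
    """Permutation of 0, ..., n-1 swapping positions i-1 and i (image of sigma_i)."""
    p = list(range(n))
    p[i - 1], p[i] = p[i], p[i - 1]
    return tuple(p)


def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[q[k]] for k in range(len(p)))


def relations_hold(n):
    """Check both families of relations (2) for the images of sigma_1, ..., sigma_{n-1}."""
    s = {i: transposition(i, n) for i in range(1, n)}
    for i in range(1, n):
        for j in range(1, n):
            # Generators whose indices differ by more than 1 commute.
            if abs(i - j) > 1 and compose(s[i], s[j]) != compose(s[j], s[i]):
                return False
        if i + 1 < n:
            left = compose(compose(s[i], s[i + 1]), s[i])
            right = compose(compose(s[i + 1], s[i]), s[i + 1])
            if left != right:
                return False
    return True


if __name__ == "__main__":
    print(all(relations_hold(n) for n in range(2, 7)))
```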

The significance of the braid group consists in the fact that a braid can be 
thought of as defining a motion of an unordered set of n points on the plane 
which are not allowed to come together. The exact result is as follows. Let D 
denote the set of points (z_1, ..., z_n) ∈ ℂ^n for which z_i = z_j for some i ≠ j. The 
symmetric group 𝔖_n acts on ℂ^n by permuting the coordinates, and preserves D. 
Write X_n for the manifold 𝔖_n\(ℂ^n \ D). The braid group Σ_n is the fundamental 
group of this space: Σ_n = π(X_n). 

A point ξ ∈ X_n is an unordered set of n distinct complex numbers z_1, ..., z_n. It 
can be specified by giving the coefficients a_1, ..., a_n of the polynomial 

\[ t^n + a_1 t^{n-1} + \cdots + a_n = (t - z_1) \cdots (t - z_n). \]

Thus we can also say that Σ_n = π(ℂ^n \ Δ), where ℂ^n is the space of the variables 
a_1, ..., a_n, and Δ is obtained by setting to zero the discriminant of the polynomial 
with coefficients a_1, ..., a_n. 
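
The passage from an unordered set of points to the coefficients of a polynomial can be made concrete. The Python sketch below is an illustration, with function names chosen here: it expands the product of the factors (t − z_k) and computes the discriminant as the product of the squared differences of the roots, which is one standard expression for it. The coefficients do not depend on the order of the roots, and the discriminant vanishes exactly when two of the roots coincide.

```python
def coefficients_from_roots(roots):
    """Return [1, a_1, ..., a_n] with t^n + a_1 t^(n-1) + ... + a_n = prod (t - z_k)."""
    coeffs = [1]
    for z in roots:
        # Multiply the current polynomial by (t - z).
        coeffs = [c - z * prev for c, prev in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def discriminant_from_roots(roots):
    """Product of (z_i - z_j)^2 over all pairs i < j."""
    d = 1
    for i in range(len(roots)):
        for j in range(i + 1, len(roots)):
            d *= (roots[i] - roots[j]) ** 2
    return d


if __name__ == "__main__":
    print(coefficients_from_roots([1, 2, 3]))
    print(discriminant_from_roots([1, 2, 3]))
```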


§ 15. Examples of Groups: Lie Groups and 
Algebraic Groups 


We now turn to the consideration of groups whose elements are given by 
continuously varying parameters; in other words, these are groups, occurring 
frequently in connection with questions of geometry or physics, whose set of 
elements itself has a geometry. This geometry may sometimes be very simple, but 
at other times far from trivial. 

For example, the group of translations x ↦ x + α (for α ∈ R) of the line, which 
reflects the coordinate change involved in changing the origin, is obviously 
isomorphic to the group of real numbers under addition, and is parametrised by 
points of the line. In the group of rotations of the plane about a fixed point O, 
each element is determined by the angle of rotation φ, and two values of φ 
determine the same rotation if they differ by an integer multiple of 2π. Hence our 
group is isomorphic to R/2πZ, and is parametrised by points of a circle with 
centre at O: if we fix some starting point P on the circle, a rotation is determined 
by the point to which it takes P. We can view the same circle as a fundamental 
domain of the group 2πZ, the interval [0, 2π] with its end points identified. 
However, from examples as simple as these we do not as yet get a feeling for the 
specific nature of the situations arising here. 


Example 1. The Group of Rotations of 3-Space. This group arises in connection 
with the description of the motion of a rigid body with one point fixed; we will 
assume that the body is 3-dimensional, not contained in a plane. We now attach 
rigidly to the body a coordinate system with centre at the fixed point O. Then a 
motion of the body defines a motion of the whole of space, namely, that for which 
the coordinates of each point in the moving coordinate system do not change; 
that is, 3-space moves together with the body. If we compare the position of all 
points at times t = 0 and t = t_0 then obviously they move in such a way that the 
distances between them do not change. In other words, passing from the initial 
position to the position at time t = t_0 is an orthogonal transformation φ_{t_0} of 
3-space, fixing the origin O. However, since the transformation φ_t depends on t, 
and does so in a continuous way, it must preserve the orientation of 3-space. 
An orthogonal motion of 3-space preserving its orientation is called a rotation. 
It is in fact realised by a rotation through a definite angle around some axis; 
this is Euler’s theorem, which can be proved by elementary geometric considerations. 
Alternatively, it follows from the fact that the characteristic polynomial 
det(λE − A) of our transformation A, being a polynomial of degree 3 with 
negative constant term (det A > 0, since A is orientation-preserving), must have 
a positive root, which must be 1 since A is orthogonal; the corresponding 
eigenvector defines the axis of rotation. 
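
Euler's theorem can be illustrated numerically. The Python sketch below is an illustration under stated assumptions, with helper names chosen here: for a rotation matrix A whose angle is neither 0 nor π, the vector with components A32 − A23, A13 − A31, A21 − A12 (built from the skew-symmetric part of A) points along the axis, so A should leave it fixed. The example composes two coordinate rotations and checks this.

```python
import math


def rot_z(angle):
    """Rotation through the given angle about the third coordinate axis."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


def rot_x(angle):
    """Rotation through the given angle about the first coordinate axis."""
    c, s = math.cos(angle), math.sin(angle)
    return ((1.0, 0.0, 0.0), (0.0, c, -s), (0.0, s, c))


def matmul3(p, q):
    """Product of two 3 x 3 matrixes given as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def apply(a, v):
    """Apply the matrix a to the vector v."""
    return tuple(sum(a[i][k] * v[k] for k in range(3)) for i in range(3))


def rotation_axis(a):
    """Vector along the axis of the rotation a (angle assumed not 0 or pi)."""
    return (a[2][1] - a[1][2], a[0][2] - a[2][0], a[1][0] - a[0][1])


if __name__ == "__main__":
    a = matmul3(rot_z(0.7), rot_x(1.1))
    v = rotation_axis(a)
    print(v, apply(a, v))  # the two vectors agree up to rounding
```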

Thus the elements of the group of rotations describe all the possible positions 
occupied by a rigid body moving with a fixed point O, and any actual motion of 
this body is described by a curve in this group (with time t as parameter); the 
group of motions is the configuration space of a moving rigid body with a fixed 
point. What is this group like geometrically? To see it, let us specify a rotation 
about an axis l through an angle φ by a vector pointing in the direction of l and 
of length φ with −π ≤ φ ≤ π. Vectors of this form fill out a ball of radius π 
centred at O. However, points of the boundary sphere corresponding to the same 
axis l but with different values φ = −π and φ = π define the same rotation. Thus 
the group of rotations can be described as the ball in 3-space R³ with diametrically 
opposite points of the boundary identified. As is well known, under this 
identification we get 3-dimensional projective space P³. This is the geometric 
description of the group of rotations. 

The same description of the group of rotations of 3-space can be obtained in 
another way. Consider the group G consisting of quaternions q of modulus 1 (see 
§ 8, Example 5). Writing q = a + bi + cj + dk, this is given by the equation 

\[ a^2 + b^2 + c^2 + d^2 = 1, \]

that is, it is the 3-dimensional sphere S³. Let H⁻ be the 3-dimensional vector 
space of purely imaginary quaternions, defined by Re x = 0. The group G acts 
on H⁻ by x ↦ qxq^{-1} for x ∈ H⁻ and q ∈ G. Since |qxq^{-1}| = |q|·|x|·|q|^{-1} = |x|, 
the action gives rise to an orthogonal transformation of H⁻. It is easy to see that 
q acts as the identity only if q = ±1, so that we obtain a homomorphism f of G 
into the group of orthogonal transformations of 3-space, with kernel ±1. Since 
G is connected, the image of f must be contained in the group of rotations, and 
by comparing dimensions it is easy to see that it must coincide with it. In other 
words, we have the following result. 


Theorem I. The group of rotations of 3-space is isomorphic to the quotient 
G/{±1} of the group of quaternions of modulus 1 by the subgroup {±1}. 
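
The action of unit quaternions on purely imaginary quaternions can be tried out directly. The Python sketch below is an illustration only: a quaternion a + bi + cj + dk is stored as the tuple (a, b, c, d), and the function names are chosen here. It checks on an example that the map x ↦ qxq⁻¹ preserves length, that q and −q give the same transformation, and that the quaternion cos(θ/2) + k sin(θ/2) rotates through the angle θ about the k-axis.

```python
import math


def qmul(p, q):
    """Hamilton product of two quaternions stored as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def conjugate(q):
    """Conjugate quaternion; for |q| = 1 this is the inverse of q."""
    a, b, c, d = q
    return (a, -b, -c, -d)


def rotate(q, x):
    """Image of the purely imaginary quaternion x = (x1, x2, x3) under x -> q x q^(-1).

    The quaternion q is assumed to have modulus 1.
    """
    result = qmul(qmul(q, (0.0,) + tuple(x)), conjugate(q))
    return result[1:]  # the real part is zero up to rounding


if __name__ == "__main__":
    theta = math.pi / 2
    q = (math.cos(theta / 2), 0.0, 0.0, math.sin(theta / 2))
    print(rotate(q, (1.0, 0.0, 0.0)))  # approximately (0, 1, 0)
```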


Since G is a 3-dimensional sphere, it follows that the group of rotations of 
3-space is obtained from the sphere S³ by identifying diametrically opposite 
points. We have thus once more obtained an identification of this group with 
projective 3-space. 




We thus meet examples of transformation groups (the group of translations 
of the line, of rotations of the plane, and of rotations of 3-space) each of which 
has elements naturally parametrised in a 1-to-1 way by points of a certain 
manifold X (the line, the circle, projective 3-space). The next step is to abstract 
away from the specification of our group as a transformation group, and to 
assume that a manifold X adequately describes the set of elements of a group, 
and that a group law is specified on this set X. We thus arrive at the notion of 
a Lie group, which has two versions according as to whether we suppose that X 
is a differentiable or complex analytic manifold; the resulting group is called a 
differentiable or complex analytic Lie group. The definition is as follows: 


Definition. A group G which is at the same time a differentiable (or complex 
analytic) manifold is a Lie group if the maps 


\[ G \to G \quad\text{given by}\quad g \mapsto g^{-1} \]
and 
\[ G \times G \to G \quad\text{given by}\quad (g_1, g_2) \mapsto g_1 g_2 \]
are differentiable (or complex analytic). 


The fact that the set of elements of a Lie group G is a manifold provides it with 
a geometry. The algebra (that is, the presence of a group law) means that this 
geometry is homogeneous. The elements of the left regular representation are 
called translations by elements of the group and they define a (differentiable or 
complex analytic) transitive transformation group of G. These allow us, for 
example, to start from a tangent vector t at the identity e ∈ G, and to use left 
translation by g ∈ G to get a tangent vector t_g at any point g, that is a vector 
field on the whole of G. Vector fields of this form are said to be left-invariant. In 
the same way we can construct left-invariant (or right-invariant) differential forms 
on G. Finally, by the same method we can construct a left-invariant (or right- 
invariant) Riemannian metric on G. 

Theorem I, which describes the group of rotations of 3-space, gives an 
example of geometric properties that are typical of many Lie groups. Firstly, the 
homomorphism G → G/{±1}, where G is the group of quaternions of modulus 
1, is obviously an unramified cover. Since G is diffeomorphic to the 3-sphere, it 
is connected and simply connected, and hence it is the universal cover of the 
group of rotations (see § 14, Example 7). It follows from this that the group of 
rotations has fundamental group π of order 2. We can get not just topological, 
but also differential geometric information on the group of rotations. Its 
invariant Riemannian metric can be made compatible with that of the group G. 
But G is the sphere S³, and is hence a manifold of positive Riemannian curvature. 
Hence the same is true of the group of rotations. It can be shown that for any 
compact Lie group the invariant Riemannian metric has nonnegative curvature, 
and that the directions in which the curvature is zero correspond to Abelian 
subgroups. 

A closed submanifold H ⊂ G of a Lie group G which is at the same time a 
subgroup is a Lie subgroup of G. In this case one can show that the set of cosets 
H\G is again a manifold, and that the quotient map G → H\G and the action of 
G on H\G are differentiable (or complex analytic). Since the action of G on H\G 
is transitive, H\G is a homogeneous manifold with respect to G. We have the 
relation 


dim G = dimH + dim H\G, (1) 


which corresponds to § 12, (9). 

In what follows we will treat in more detail two types of Lie groups, compact 
and complex analytic, and describe the most important examples of both types 
and the relations between them. 


A. Compact Lie Groups 


Example 2. Toruses. In a real n-dimensional vector space L, consider the lattice 
C = Ze_1 + ... + Ze_n, where e_1, ..., e_n is some basis of L. The quotient group 
T = L/C is compact. It is a Lie group and is called a torus. Since L = Re_1 + ... + 
Re_n, we have 


\[ T = (\mathbb{R}/\mathbb{Z}) \times \cdots \times (\mathbb{R}/\mathbb{Z}). \]


The quotient group R/Z is a circle, and an n-dimensional torus is a direct product 
of n circles. Toruses have an enormous number of applications, and we indicate 
three of these. 

(a) A periodic function of period 2π is a function on the circle R/2πZ. As we 
will see later, this point of view gives a new way of looking at the theory of Fourier 
series. 

(b) Take L = ℂ to be the plane of one complex variable; we have already 
considered this example in § 14, Example 1. If C ⊂ ℂ is a lattice then the torus 
ℂ/C inherits from ℂ the structure of a complex analytic manifold. As a complex 
manifold ℂ/C is 1-dimensional. It can be shown that these are the unique 
compact complex analytic Lie groups of complex dimension 1. In a similar way, 
an arbitrary compact complex analytic Lie group is a torus ℂ^n/C, with C = 
Ze_1 + ... + Ze_{2n} a lattice in the 2n-dimensional real vector space ℂ^n. In particular, 
such a group is necessarily Abelian. 

(c) In Arnol’d’s treatment of classical mechanics, Liouville’s theorem asserts 
that given a mechanical system with n degrees of freedom, if we know n independent 
first integrals $I_1, \ldots, I_n$ in involution (that is, all Poisson brackets vanish, 
$[I_\alpha, I_\beta] = 0$), then the system can be integrated by quadratures. The proof is based 
on the fact that in the 2n-dimensional phase space, the n-dimensional level 
manifold $T_c$, where $c = (c_1, \ldots, c_n)$, given by 


$$T_c\colon\quad I_\alpha = c_\alpha \quad (\text{for } \alpha = 1, \ldots, n)$$


is a torus. This follows at once from the fact that on $T_c$, the functions $I_\alpha$ define 
n vector fields: these are given by the differential forms $dI_\alpha$ using the symplectic 
structure defined on phase space. Each vector field defines a 1-parameter group 
$U_\alpha(t)$ of transformations, and the relations $[I_\alpha, I_\beta] = 0$ mean that the transformations 
$U_\alpha(t_1)$ and $U_\beta(t_2)$ commute. Thus the Lie group $\mathbb{R}^n$ acts on the manifold $T_c$, 
with $(t_1, \ldots, t_n) \in \mathbb{R}^n$ corresponding to the transformation $U_1(t_1) \cdots U_n(t_n)$. It 
follows from this that $T_c$ is a quotient group of $\mathbb{R}^n$ by the stabiliser subgroup H 
of some point $x_0 \in T_c$. Since $T_c$ is n-dimensional and compact (because kinetic 
energy, which is a positive definite form, is constant on it), $T_c = \mathbb{R}^n/H$ is a torus. 
Thus the motion of a point corresponding to the system always takes place on 
a torus, and moreover it can be proved that the point moves in a 1-dimensional 
subgroup of the torus (to this there corresponds the introduction of the so-called 
‘action angles’). 

We move on to describe non-Abelian compact Lie groups. We will describe 
three series of groups (Examples 3, 4 and 5), usually called the classical groups. 
Each of these groups occurs in several versions, usually as certain matrix groups. 
For a matrix group G, we write SG for the set of all elements of G with 
determinant 1 (here S stands for ‘special’); the quotient groups of G and SG by 
their centres are denoted by PG and PSG (where P stands for ‘projective’). 


Example 3. The orthogonal group O(n) consists of all orthogonal transformations 
of n-dimensional Euclidean space. This group acts on the unit sphere $S^{n-1}$ 
in n-space. If $e \in S^{n-1}$ with |e| = 1, the stabiliser subgroup of e acts on the 
hyperplane orthogonal to e, and is isomorphic to O(n − 1). Hence in view of (1), 
dim O(n) = dim O(n − 1) + n − 1, and therefore 


$$\dim O(n) = \binom{n}{2} = \frac{n(n-1)}{2}.$$
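
The recursion above can be unwound mechanically. The following minimal Python sketch (the function name `dim_orthogonal` is ours, introduced only for illustration) starts from dim O(1) = 0, which holds because O(1) = {±1} is a finite group, adds n − 1 at each step, and compares the result with n(n − 1)/2.

```python
def dim_orthogonal(n: int) -> int:
    """Dimension of O(n) from dim O(n) = dim O(n-1) + (n-1), with dim O(1) = 0."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    dim = 0  # O(1) = {+1, -1} is finite, hence 0-dimensional
    for k in range(2, n + 1):
        dim += k - 1  # the orbit S^(k-1) of O(k) contributes k - 1 dimensions
    return dim


# The recursion agrees with the closed form n(n-1)/2 for small n.
for n in range(1, 10):
    assert dim_orthogonal(n) == n * (n - 1) // 2
```

The same bookkeeping, applied to the action of U(n) on the unit sphere of complex n-space, is what lies behind the dimension count for the unitary group.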


The group O(n) is not connected. It has an important subgroup of index 2, 
denoted by SO(n) and consisting of orthogonal transformations of determinant 
1. It is easy to prove that the group SO(n) is connected. If n is odd then the centre 
of SO(n) consists of E only, and if n ≥ 4 is even, then it consists of E and −E. 
The quotient group of SO(n) by its centre is denoted by PSO(n). The group of 
motions of 3-space treated in Example 1 is SO(3). 

A natural generalisation of O(n) is related to considering an arbitrary non- 
degenerate quadratic form 


$$x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_{p+q}^2, \quad \text{with } p + q = n.$$


Linear transformations of $\mathbb{R}^n$ preserving this quadratic form form a Lie group 
denoted by O(p, q). This is a compact group only if p = 0 or q = 0. We will meet 
these groups with other values of p and q later. 


Example 4. The unitary group U(n) consists of the unitary transformations of 
an n-dimensional Hermitian complex vector space. As in Example 3 it is proved 
that dim U(n) = $n^2$. The determinant of a transformation in U(n) is a complex 
number of absolute value 1; the transformations with determinant 1 form a 
subgroup $SU(n) \subset U(n)$ of dimension $n^2 - 1$. The centre of SU(n) consists of the 
transformations $\varepsilon E$ where $\varepsilon^n = 1$. The quotient group by the centre is denoted 
by PSU(n). 


Example 5. Consider the n-dimensional vector space $\mathbb{H}^n$ over the algebra of 
quaternions (§ 8, Example 5). On $\mathbb{H}^n$ we define the scalar product with values in 
$\mathbb{H}$ 


$$((x_1, \ldots, x_n), (y_1, \ldots, y_n)) = \sum_{i=1}^{n} x_i \bar{y}_i, \tag{2}$$


where $\bar{y}_i$ are the conjugate quaternions. The group of linear transformations in 
$\operatorname{Aut}_{\mathbb{H}} \mathbb{H}^n$ that preserves this scalar product is called the unitary symplectic group 
and is denoted by SpU(n). For n = 1 we obtain the group of quaternions q of 
modulus 1. 

In the general case, 


$$\dim \mathrm{SpU}(n) = 2n^2 + n.$$


There are relations between the different classical Lie groups that are often 
useful. 

Theorem I can now be rewritten in the form 


$$\mathrm{SpU}(1)/\{\pm 1\} \cong \mathrm{SO}(3). \tag{3}$$


As we have seen, it follows from this that $|\pi(SO(3))| = 2$. 

An analogous representation for the groups SO(n), and even for all of the 
groups SO(p, q) (which are in general noncompact) can be obtained using 
the Clifford algebra (§8, Example 10). The fact that for any quaternion q the 
transformation $x \mapsto qxq^{-1}$ takes the space of purely imaginary quaternions into 
itself is a special feature of the case n = 3. In the general case, consider the Clifford 
algebra C(L) corresponding to the space L over $\mathbb{R}$ with the metric 


$$x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_{p+q}^2$$


with p + q = n. 

Recall that $L \subset C(L)$. We introduce the group G of invertible elements $a \in C^0(L)$ 
for which $a^{-1}La \subset L$. Obviously G is a group. It is easy to check that for 
$a \in G$ the map $x \mapsto a^{-1}xa$ for $x \in L$ preserves the metric of L. Thus we get the 
homomorphism 


$$f\colon G \to O(p, q).$$


It is easy to see that the kernel of f consists of $a \in \mathbb{R}$ with $a \neq 0$. It can be proved 
(using the well-known fact that any orthogonal transformation can be expressed 
as a composite of reflections) that the image of f is SO(p, q) and that any element 
of G can be written in the form $a = c_1 \cdots c_r$ with $c_i \in L$ for some even r. It follows 
from this that for $a \in G$ the element $aa^* \in \mathbb{R}$, where $a \mapsto a^*$ is the involution of 
C(L) (see §8, Example 10), and if we set $aa^* = N(a)$ then for $a, b \in G$ we have 
N(ab) = N(a)N(b). Hence the elements $a \in G$ with N(a) = 1 form a group. This 
is called the spinor group and denoted by Spin(p, q). When q = 0 it is denoted by 
Spin(n). It is easy to see that the group Spin(p, q) is connected. The kernel of the 
homomorphism $f\colon \mathrm{Spin}(p, q) \to O(p, q)$ consists of $\{\pm E\}$. The image depends on 
the numbers p and q. If q = 0 then obviously O(p, q) = O(n) and it is easy to check 
that f(Spin(n)) = SO(n). That is, 


$$\mathrm{Spin}(n)/\{\pm 1\} \cong \mathrm{SO}(n). \tag{4}$$


Thus the group Spin(n) is a double covering of SO(n). Using an induction, it is 
easy to prove (starting with n = 3) that $\pi(SO(n))$ has order 1 or 2. But we have 
constructed a 2-sheeted cover $\mathrm{Spin}(n) \to SO(n)$, and this proves that $|\pi(SO(n))| = 2$, 
and that Spin(n) is simply connected. 

If p > 0 and q > 0 then G contains elements both with positive and with 
negative norms, and their images define two distinct components of SO(p, q). The 
image of the elements with positive norm forms a subgroup $SO^+(p, q) \subset SO(p, q)$ 
of index 2, and $f(\mathrm{Spin}(p, q)) = SO^+(p, q)$, that is 


$$\mathrm{Spin}(p, q)/\{\pm 1\} \cong \mathrm{SO}^+(p, q). \tag{5}$$

As we saw in §8, Example 6, any quaternion can be written in the form 


$$q = z_1 + jz_2 \quad \text{with } z_1, z_2 \in \mathbb{C}; \tag{6}$$


here $j^2 = -1$ and $zj = j\bar{z}$. In the form (6) the quaternions form the 2-dimensional 
vector space $\mathbb{C}^2$ over $\mathbb{C}$, where multiplication by $z \in \mathbb{C}$ is taken as multiplication 
on the right. Hence the left regular representation gives a representation of the 
quaternions by $\mathbb{C}$-linear transformations of $\mathbb{C}^2$, that is by 2 × 2 matrixes. Taking 
the basis {1, j} sends the quaternion (6) to the matrix 


$$\begin{pmatrix} z_1 & -\bar{z}_2 \\ z_2 & \bar{z}_1 \end{pmatrix}.$$


In the notation (6), the modulus of the quaternion q is $\sqrt{|z_1|^2 + |z_2|^2}$. Hence 
multiplication by a quaternion of modulus 1 gives a unitary transformation of 
$\mathbb{C}^2$ with the metric $|z_1|^2 + |z_2|^2$. Moreover, the determinant of the above matrix 
is also $|z_1|^2 + |z_2|^2$, that is, in our case equal to 1. We thus get a homomorphism 


$$\mathrm{SpU}(1) \to \mathrm{SU}(2), \tag{7}$$


whose kernel is 1. From considerations of dimension and because SU(2) is 
connected, we see that this is an isomorphism, so that SU(2) is isomorphic to the 
group of quaternions of modulus 1. Putting together (3) and (7) we get the 
isomorphism 

$$\mathrm{SO}(3) \cong \mathrm{SU}(2)/\{\pm 1\}. \tag{8}$$
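
As a numerical sanity check of the homomorphism (7), the following sketch stores a quaternion $z_1 + jz_2$ as a pair of complex numbers, multiplies quaternions using the rule $zj = j\bar{z}$, and writes the matrix of left multiplication in the basis {1, j} with the images of 1 and j as its columns. The column convention and all function names are assumptions made for this illustration; the check uses only the Python standard library.

```python
import math
import random


def quat_mul(q, w):
    """Product of q = z1 + j*z2 and w = w1 + j*w2, using z*j = j*conj(z)."""
    z1, z2 = q
    w1, w2 = w
    return (z1 * w1 - z2.conjugate() * w2, z1.conjugate() * w2 + z2 * w1)


def quat_matrix(q):
    """Matrix of x -> q*x on C^2 in the basis {1, j}; columns are q*1 and q*j."""
    z1, z2 = q
    return [[z1, -z2.conjugate()], [z2, z1.conjugate()]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def conj_transpose(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def random_unit_quaternion(rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(t * t for t in v))
    return (complex(v[0], v[1]) / norm, complex(v[2], v[3]) / norm)


rng = random.Random(0)
q, w = random_unit_quaternion(rng), random_unit_quaternion(rng)
U, V = quat_matrix(q), quat_matrix(w)
IDENTITY = [[1, 0], [0, 1]]

assert close(mat_mul(U, V), quat_matrix(quat_mul(q, w)))  # multiplicative
assert close(mat_mul(conj_transpose(U), U), IDENTITY)      # unitary
assert abs(det2(U) - 1) < 1e-12                            # determinant 1
```

With the transposed arrangement of the matrix entries the first assertion would hold with the factors in the opposite order, which is why the convention has to be fixed before checking multiplicativity.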


An elementary interpretation of this is as follows: consider the set L of 2 × 2 
Hermitian matrixes of trace 0. The group SU(2) acts on this by $A \mapsto UAU^{-1}$ for 
$U \in SU(2)$. Introducing a metric on L by $|A|^2 = -\det A$ for $A \in L$ makes L 
into a 3-dimensional Euclidean space. The transformation corresponding to 
$U \in SU(2)$ defines a transformation $\gamma \in SO(3)$. This is the homomorphism 
$SU(2) \to SO(3)$. 


Let $\mathbb{H}$ be the 4-dimensional real vector space of quaternions, with the metric 
given by the modulus. We define an action of SpU(1) × SpU(1) on $\mathbb{H}$ by 


$$(q_1, q_2)(x) = q_1 x q_2^{-1} \quad \text{for } x \in \mathbb{H}. \tag{9}$$


It is easy to see that only the pairs (1, 1) and (−1, −1) act trivially. Our 
transformations obviously preserve the modulus, and so are orthogonal. The 
determinant of the transformations (9) is $|q_1|^4 \cdot |q_2|^{-4}$, that is in our case 1. We get 
a homomorphism 


$$\mathrm{SpU}(1) \times \mathrm{SpU}(1) \to \mathrm{SO}(4),$$


the kernel of which we know. By considerations of connectedness and dimension, 
the image is the whole of SO(4). We have obtained an isomorphism 


$$\mathrm{SO}(4) \cong (\mathrm{SpU}(1) \times \mathrm{SpU}(1))/H, \tag{10}$$


where H is a subgroup of the centre of order 2. The whole centre of SpU(1) × 
SpU(1) is the product of two groups of order 2, the centres of the two factors SpU(1). Taking 
the quotient of the left-hand side of (10) by the centre of SO(4), we must on the 
right-hand side quotient SpU(1) × SpU(1) by the whole of its centre. But the 
quotient of SpU(1) by its centre is SO(3). Therefore 


$$\mathrm{PSO}(4) \cong \mathrm{SO}(3) \times \mathrm{SO}(3). \tag{11}$$
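
The construction of the homomorphism SpU(1) × SpU(1) → SO(4) can be tested numerically. In the sketch below quaternions are 4-tuples of real numbers, the inverse of a unit quaternion is taken to be its conjugate, and `action_matrix` (a name chosen here for illustration) builds the 4 × 4 real matrix of $x \mapsto q_1 x q_2^{-1}$. We then check that this matrix is orthogonal of determinant 1 and that the pair $(-q_1, -q_2)$ gives the same transformation, while $(-q_1, q_2)$ does not.

```python
import math
import random


def qmul(p, q):
    """Hamilton product of quaternions written as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)


def negate(q):
    return tuple(-t for t in q)


def random_unit_quaternion(rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(t * t for t in v))
    return tuple(t / norm for t in v)


def action_matrix(q1, q2):
    """Matrix of x -> q1 * x * q2^{-1}; for |q2| = 1 the inverse is the conjugate."""
    basis = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
    columns = [qmul(qmul(q1, e), qconj(q2)) for e in basis]
    return [[columns[j][i] for j in range(4)] for i in range(4)]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def is_orthogonal(m, tol=1e-9):
    n = len(m)
    return all(abs(sum(m[k][i] * m[k][j] for k in range(n)) - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))


rng = random.Random(1)
q1, q2 = random_unit_quaternion(rng), random_unit_quaternion(rng)
M = action_matrix(q1, q2)
same = action_matrix(negate(q1), negate(q2))
flipped = action_matrix(negate(q1), q2)

assert is_orthogonal(M)
assert abs(det(M) - 1) < 1e-9
assert all(abs(same[i][j] - M[i][j]) < 1e-12 for i in range(4) for j in range(4))
assert all(abs(flipped[i][j] + M[i][j]) < 1e-12 for i in range(4) for j in range(4))
```

The last assertion shows that $(-q_1, q_2)$ acts as minus the transformation of $(q_1, q_2)$, so it maps to the element −E of the centre of SO(4) times the original rotation.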


B. Complex Analytic Lie Groups 


The three series of complex Lie groups given in the following Examples 6, 7, 
8 are also called the classical groups. 


Example 6. The general linear group GL(n, C) is the group of nondegenerate 
linear transformations of an n-dimensional complex vector space. The dimension 
of GL(n, C) as a complex analytic manifold is obviously $n^2$. It contains the 
subgroup SL(n,C) of linear transformations of determinant 1. The centre of 
GL(n,C) consists of scalar multiples of the identity matrix; the quotient of 
GL(n, C) by its centre is denoted by PGL(n, C). 


Example 7. The subgroup of GL(n, C) consisting of transformations that 
preserve some nondegenerate quadratic form ($x_1^2 + \cdots + x_n^2$ in a suitable 
coordinate system) is denoted by O(n, C) and is called the orthogonal group. Its 
dimension as a complex analytic variety is $n(n-1)/2$. 


Example 8. The subgroup of GL(2n, C) consisting of transformations that 
preserve some nondegenerate skew-symmetric form (of the form 


$$\sum_{i=1}^{n} (x_i y_{n+i} - x_{n+i} y_i)$$


in a suitable coordinate system) is called the symplectic group and denoted by 
Sp(2n, C). 

The compact and complex analytic classical groups are closely related; the 
simplest example of this is the circle group $S^1$ viewed as the subgroup $S^1 \subset \mathbb{C}^*$ 
of complex numbers of absolute value 1: 


$$S^1 = U(1) \subset GL(1, \mathbb{C}) = \mathbb{C}^*.$$


More generally, the relation is as follows. Obviously U(n) < GL(n, C). It can be 
proved that U(n) is a maximal compact subgroup of GL(n, C), that is, it is not 
contained in any bigger compact subgroup. Any other compact subgroup of 
GL(n, C) is conjugate to a subgroup of U(n). The reason for this will be explained 
in §17B. Similarly, the group $O(n) \subset O(n, C)$ is a maximal compact subgroup, 
and any compact subgroup of O(n, C) is conjugate to a subgroup of O(n). To 
establish the analogous result for the symplectic groups, we use the expression 
(6) of the quaternions as a 2-dimensional complex vector space, H = C + jC. 
Then a vector $x = (x_1, \ldots, x_n) \in \mathbb{H}^n$ can be written as $(z_1, \ldots, z_n, z_{n+1}, \ldots, z_{2n})$ 
where $z_i \in \mathbb{C}$ and $x_i = z_i + jz_{n+i}$. In these coordinates, as one checks easily, the 
product (2) of Example 5 takes the form 


$$(x, y) = \sum_i z_i \bar{w}_i + j \sum_i (z_i w_{i+n} - w_i z_{i+n})$$


(where $y = (y_1, \ldots, y_n)$ and $y_i = w_i + jw_{n+i}$). Thus if $(x, y) = \alpha + j\beta$ then $\alpha$ is a 
Hermitian scalar product of the complex vectors x and y, and $\beta$ is the value of 
the skew-symmetric form $\sum_i (z_i w_{i+n} - w_i z_{i+n})$. Every $\mathbb{H}$-linear transformation $\varphi$ 
of $\mathbb{H}^n$ can be written as a $\mathbb{C}$-linear transformation of $\mathbb{C}^{2n}$ and by what we have 
said above, the condition $\varphi \in SpU(n)$ means that $\varphi \in U(2n)$ and $\varphi \in Sp(2n, C)$. 
Thus 


$$\mathrm{SpU}(n) = \mathrm{U}(2n) \cap \mathrm{Sp}(2n, \mathbb{C}).$$


In particular, SpU(n) is a subgroup of Sp(2n,C). It is a maximal compact 
subgroup, and any compact subgroup of Sp(2n, C) is conjugate to a subgroup of 
SpU(n). 

In all three cases, the dimension of the ambient complex group (as a complex 
analytic manifold) is equal to the dimension of the compact subgroup (as a 
differentiable manifold). 

To conclude we treat some important Lie groups of small dimensions. 

The group O(3, 1) is called the Lorentz group, and SO(3, 1) the proper Lorentz 
group. If we interpret $x_1, x_2, x_3$ as space coordinates and $x_0$ as time, then 
preserving the form $f = -x_0^2 + x_1^2 + x_2^2 + x_3^2$ is equivalent to preserving the 
speed of light (which we consider to be equal to 1). The same group has another 
interpretation, that is just as important. Consider $x_0, x_1, x_2, x_3$ as homogeneous 
coordinates in a 3-dimensional projective space $\mathbb{P}^3$. The equation f = 0 defines 
in $\mathbb{P}^3$ a surface of degree 2, which can be written in inhomogeneous coordinates 
$y_i = x_i/x_0$ for i = 1, 2, 3 as $y_1^2 + y_2^2 + y_3^2 = 1$. Thus this is the sphere $S \subset \mathbb{P}^3$. 
Considering transformations of O(3, 1) in homogeneous coordinates makes them 
into projective transformations of $\mathbb{P}^3$ preserving S. Of course, multiplying all the 
coordinates through by −1 gives the identity transformation, so that the group 
PO(3, 1) acts on $\mathbb{P}^3$. Obviously, this group also preserves the interior of S. 
But as is well known, in the Cayley-Klein model of non-Euclidean geometry, 
3-dimensional Lobachevsky space is represented exactly as the points of the 
interior of the sphere, and its motions by projective transformations preserving 
the sphere. This proves the next theorem. 


Theorem II. PO(3, 1) is isomorphic to the group of all motions of 3-dimensional 
Lobachevsky space, and PSO(3, 1) to the group of all proper (orientation-preserving) 
motions. 


Of course, this assertion is of a general nature: the group PO(n, 1) is isomorphic 
to the group of motions of n-dimensional Lobachevsky space. 

The Lorentz group O(3,1) has another important interpretation. This is 
based on considering the spin group Spin(3, 1) (see (5) above). In the course of 
constructing this group, we saw that for $a \in G$ the norm $N(a) = aa^* \in \mathbb{R}$. But in 
our particular case this condition is sufficient: if $aa^* = \alpha \in \mathbb{R}$ and $\alpha \neq 0$ then 
$a^{-1}La \subset L$, that is $a \in G$. In fact from $aa^* = \alpha$ we get $a^{-1} = \alpha^{-1}a^*$, and it follows 
from this that $(a^{-1}xa)^* = a^{-1}xa$ for $x \in L$. On the other hand, $a^{-1}xa \in C^1$, and 
$C^1$ consists of linear combinations of the elements $e_i$ and the products $e_ie_je_k$, 
where the $e_i$ form an orthogonal basis of L. Of these, only the linear combinations 
of the $e_i$ do not change sign under $x \mapsto x^*$, and hence $a^{-1}xa \in L$. Now we use the 
fact that we can find an explicit representation of the algebra $C^0$ as a matrix 
algebra (see §8, Example 11 and § 10, Example 6). If $e_0, e_1, e_2, e_3$ is the basis in 
which f has the form $-x_0^2 + x_1^2 + x_2^2 + x_3^2$ then 1, $e_0e_1$, $e_0e_2$ and $e_1e_2$ generate 
the algebra $M_2(\mathbb{R})$, 1 and $e_0e_1e_2e_3$ the algebra $\mathbb{C}$, and the whole of $C^0$ is 
isomorphic to $M_2(\mathbb{C})$. It is easy to check that under this isomorphism the 
involution $a \mapsto a^*$ corresponds to the map 


$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \mapsto \begin{pmatrix} \delta & -\beta \\ -\gamma & \alpha \end{pmatrix},$$


and the condition $aa^* \in \mathbb{R}$ means that $\det A \in \mathbb{R}$ for $A \in M_2(\mathbb{C})$. Hence Spin(3, 1) 
is isomorphic to SL(2, C), and we get a homomorphism 


$$\mathrm{SL}(2, \mathbb{C}) \to \mathrm{SO}(3, 1).$$


Its image is $SO^+(3, 1)$ and the kernel is $\{\pm 1\}$. Thus 


$$\mathrm{PSL}(2, \mathbb{C}) \cong \mathrm{SO}^+(3, 1).$$


The homomorphism $SL(2, C) \to SO(3, 1)$ has the following elementary interpretation. 
Consider the space L of 2 × 2 Hermitian matrixes, and the action of 
SL(2, C) on L by $A \mapsto CAC^*$ for $A \in L$ and $C \in SL(2, C)$. Introduce on L the 
metric $|A|^2 = -\det A$, which is of the form $-x_0^2 + x_1^2 + x_2^2 + x_3^2$ in some basis. 
The action of SL(2, C) described above therefore defines a homomorphism 
$SL(2, C) \to O(3, 1)$. This is the same as our homomorphism $SL(2, C) \to SO(3, 1)$. 
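
The elementary interpretation lends itself to a direct numerical check. In the sketch below a Hermitian matrix is written as $\begin{pmatrix} x_0 + x_3 & x_1 - ix_2 \\ x_1 + ix_2 & x_0 - x_3 \end{pmatrix}$, a choice of basis made here for illustration so that $-\det A = -x_0^2 + x_1^2 + x_2^2 + x_3^2$. The function names are ours; the code builds the 4 × 4 real matrix of $A \mapsto CAC^*$ for a random C of determinant 1 and verifies that it preserves the form, and that C and −C give the same matrix.

```python
import cmath
import random


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(a):
    """Conjugate transpose of a complex matrix."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]


def hermitian(x):
    """Hermitian matrix with -det equal to -x0^2 + x1^2 + x2^2 + x3^2."""
    x0, x1, x2, x3 = x
    return [[complex(x0 + x3, 0), complex(x1, -x2)],
            [complex(x1, x2), complex(x0 - x3, 0)]]


def coordinates(a):
    """Recover (x0, x1, x2, x3) from a matrix of the shape produced by `hermitian`."""
    return [(a[0][0].real + a[1][1].real) / 2, a[1][0].real, a[1][0].imag,
            (a[0][0].real - a[1][1].real) / 2]


def random_sl2c(rng):
    """Random complex 2 x 2 matrix rescaled to have determinant 1."""
    c = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)] for _ in range(2)]
    root = cmath.sqrt(c[0][0] * c[1][1] - c[0][1] * c[1][0])
    return [[entry / root for entry in row] for row in c]


def lorentz_matrix(c):
    """Real 4 x 4 matrix of A -> C A C* in the coordinates (x0, x1, x2, x3)."""
    basis = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
    columns = [coordinates(mat_mul(mat_mul(c, hermitian(e)), dagger(c))) for e in basis]
    return [[columns[j][i] for j in range(4)] for i in range(4)]


ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def preserves_form(lam, tol=1e-9):
    """Check lam^T * ETA * lam == ETA, i.e. lam lies in O(3, 1)."""
    lam_t = [list(row) for row in zip(*lam)]
    gram = mat_mul(mat_mul(lam_t, ETA), lam)
    return all(abs(gram[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))


C = random_sl2c(random.Random(0))
MINUS_C = [[-entry for entry in row] for row in C]
LAM = lorentz_matrix(C)
LAM_MINUS = lorentz_matrix(MINUS_C)

assert preserves_form(LAM)
assert LAM[0][0] > 0  # the time direction is not reversed
assert all(abs(LAM[i][j] - LAM_MINUS[i][j]) < 1e-12 for i in range(4) for j in range(4))
```

The agreement of the matrices for C and −C is the numerical counterpart of the kernel {±1}.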


C. Algebraic Groups 


We now consider a class of groups that gives interesting examples of Lie 
groups, discrete groups and finite groups all at one go. 

We treat only one case of these, the algebraic matrix groups (also called linear 
algebraic groups). These can be defined over an arbitrary field K as the subgroups 
of GL(n, K) given by algebraic equations with coefficients in K. Examples are: 
SL(n, K); O(f, K), the group of matrixes preserving f, for some quadratic form f 
with coefficients in K; Sp(2n, K); the group of upper-triangular matrixes $(a_{ij})$ with 
$a_{ij} = 0$ for i > j and $a_{ii} \neq 0$, or its subgroup in which all $a_{ii} = 1$. In particular, the 
group consisting of 2 × 2 matrixes of the form $\begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}$ is isomorphic to the group 
of elements of K under addition; this is denoted by $G_a$. The group GL(1, K) is 
isomorphic to the group of nonzero elements of K under multiplication, and is denoted 
by $G_m$ or $K^*$. If the field K is contained in R or C and G is an algebraic group 
defined over K, then the real or complex matrixes in G define a real Lie group 
G(R) or a complex analytic group G(C). The majority of the Lie groups we have 
considered are of this type. But the general notion is more flexible, since for 
example it allows us to consider algebraic groups over the rational number field. 
Thus considering the group O(f, Q) gives a group-theoretic method of studying 
arithmetical properties of a rational quadratic form f. Moreover, considering 
matrixes with integral entries and with determinant +1 in an algebraic matrix 
group G defined over Q, we obtain a discrete subgroup G(Z) of the Lie group 
G(R). For the groups G = SL(n), O(f), Sp(n) the quotient spaces G(Z)\G(R) are 
either compact or have finite volume (in the sense of the measure defined by an 
invariant measure on G(R)); for examples, see §14, Examples 3a–5. Groups of 
this form, and also their subgroups of finite index are called arithmetic groups; 
to treat this notion in the natural generality would require us to consider as well 
as Q any algebraic number field. 

Finally, algebraic groups such as GL(n), O(n) and Sp(n) can also be considered 
over finite fields, and they give interesting examples of finite groups. We have 
already met the group GL(n, $\mathbb{F}_q$) in § 13, Example 9. 
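
To make the finite groups obtained in this way concrete, the following sketch counts the invertible n × n matrixes over a prime field $\mathbb{F}_p$ by brute force and compares the count with the standard formula $(q^n - 1)(q^n - q) \cdots (q^n - q^{n-1})$ for the order of GL(n, $\mathbb{F}_q$). The formula is a well-known counting result that is not derived in this passage, and the restriction to prime p (so that arithmetic is simply arithmetic modulo p) and the function names are assumptions of the illustration.

```python
from itertools import product


def det_mod_p(matrix, p):
    """Determinant of a square integer matrix modulo a prime p (Gaussian elimination)."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] % p), None)
        if pivot is None:
            return 0  # no pivot in this column: the matrix is singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap changes the sign of the determinant
        det = det * m[col][col] % p
        inv = pow(m[col][col], -1, p)  # modular inverse (Python 3.8+)
        for r in range(col + 1, n):
            factor = m[r][col] * inv % p
            m[r] = [(a - factor * b) % p for a, b in zip(m[r], m[col])]
    return det % p


def order_gl_bruteforce(n, p):
    """Count n x n matrixes over F_p with nonzero determinant."""
    count = 0
    for entries in product(range(p), repeat=n * n):
        m = [list(entries[i * n:(i + 1) * n]) for i in range(n)]
        if det_mod_p(m, p) != 0:
            count += 1
    return count


def order_gl_formula(n, q):
    """Order of GL(n, F_q): the number of ordered bases of an n-dimensional space."""
    result = 1
    for i in range(n):
        result *= q ** n - q ** i
    return result


for n, p in [(2, 2), (2, 3), (3, 2), (2, 5)]:
    assert order_gl_bruteforce(n, p) == order_gl_formula(n, p)
```

The brute-force count grows like $p^{n^2}$, so it is only practical for very small n and p; the formula has no such limitation.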

There exists another completely unexpected way in which discrete groups arise 
in connection with algebraic groups. If a matrix group G is defined over the 
rational number field K = Q then we can consider the groups G(Q) of its rational 
points, G(R) of its real points or $G(\mathbb{Q}_p)$ of its p-adic points (see the end of §7). 
The most invariant way of treating the fields R and $\mathbb{Q}_p$ on an equal footing is to 
consider the ‘infinite product’ of G(R) and all of the $G(\mathbb{Q}_p)$. We will not give a 
precise definition of this product, which is called the adèle group of G and denoted 
by $G_A$. Since $G(Q) \subset G(R)$ and $G(Q) \subset G(\mathbb{Q}_p)$ for all p we have a diagonal 
inclusion $G(Q) \subset G_A$. It turns out that G(Q) is a discrete subgroup of $G_A$. The 
idea that matrixes with rational entries should form a discrete subgroup is a very 
unfamiliar one, although the principle is easy to understand in the example 
$G = G_a$, the group of all numbers under addition. If $x \in G(Q)$ is a rational number 
then the condition $\varphi_p(x) \le 1$ for all p (see §7 for the definition of the valuation 
$\varphi_p$) means that x is an integer, and $\varphi_{\mathbb{R}}(x) < 1$ then implies that x = 0. In a number 
of cases (such as, for example, G = SL(n), O(f), Sp(n)) the quotient space $G(Q)\backslash G_A$ 
is of finite volume. It can be shown that this volume is uniquely determined by 
the group G only. This volume is the so-called Tamagawa number $\tau(G)$ of G, and 
is a very important arithmetic invariant. For example, if f is a rational positive 
definite quadratic form then it follows from the Minkowski-Hasse theorem (§ 7, 
Theorem IV) that the equation f(x) = a for rational x and a is solvable if and 
only if it is solvable in R (that is, a > 0) and in all $\mathbb{Q}_p$ (that is the congruences 
$f(x) \equiv a \bmod p^k$ are all solvable). But if these conditions are satisfied, we can give 
numerical characterisations of the number of integral solutions in terms of the 
number of solutions of the congruences $f(x) \equiv a \bmod p^k$; this turns out to be 
equivalent to finding the Tamagawa number $\tau(G)$ (which is equal to 2: this is a 
reflection of the fact that $|\pi(SO(n))| = 2$). 

The ideal result of ‘abstract’ group theory would be a description of all possible 
groups up to isomorphism, completely independently of concrete realisations of 
the groups. In this generality, the problem is of course entirely impracticable. 
More concretely one could envisage the problem (which is still very wide) of 
classifying all finite groups. Since only a finite number of Cayley tables (multi- 
plication tables) can be made up from a finite number of elements, there are only 
finitely many nonisomorphic groups of a given order; ideally one would like 
a rule specifying all finite groups of given order. For fairly small orders this 
can be done without much difficulty, and we run through the groups which 
arise. 

We should recall that for finite Abelian groups the answer is provided by the 
basic theorem on finitely generated modules over a principal ideal ring (see § 5, 
Example 6 and §6, Theorem II): this says that a finite Abelian group (written 
additively) can be represented as a direct sum of groups $\mathbb{Z}/(p^k)$, where the p are 
prime numbers, and such a representation is unique. Thus only non-Abelian 
groups cause any difficulty. 
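
The classification of finite Abelian groups turns counting them into pure arithmetic: for each prime power $p^e$ exactly dividing the order, the possible p-parts correspond to the partitions of e, so the number of Abelian groups of order n is the product of the partition numbers of the exponents. The sketch below implements this count; the function names are ours and the code is an illustration of the theorem rather than part of its proof.

```python
from functools import lru_cache


def prime_exponents(n):
    """Exponents e_1, e_2, ... in the factorisation n = p_1^e_1 * p_2^e_2 * ..."""
    exponents = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            exponents.append(e)
        p += 1
    if n > 1:
        exponents.append(1)  # one remaining prime factor
    return exponents


@lru_cache(maxsize=None)
def partitions(k, largest=None):
    """Number of partitions of k into parts of size at most `largest`."""
    if largest is None:
        largest = k
    if k == 0:
        return 1
    return sum(partitions(k - part, part) for part in range(1, min(k, largest) + 1))


def count_abelian_groups(n):
    """Number of Abelian groups of order n up to isomorphism."""
    result = 1
    for e in prime_exponents(n):
        result *= partitions(e)
    return result


# Order 8 = 2^3: the partitions 3, 2+1, 1+1+1 give Z/(8), Z/(4)+Z/(2), (Z/(2))^3.
assert count_abelian_groups(8) == 3
assert [count_abelian_groups(n) for n in (4, 6, 9, 10)] == [2, 1, 2, 1]
```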

Example 1. We now list all groups of order ≤ 10 up to isomorphism: 
$|G| = 2$: $G \cong \mathbb{Z}/(2)$. 
$|G| = 3$: $G \cong \mathbb{Z}/(3)$. 
$|G| = 4$: $G \cong \mathbb{Z}/(4)$ or $\mathbb{Z}/(2) \oplus \mathbb{Z}/(2)$. 
$|G| = 5$: $G \cong \mathbb{Z}/(5)$. 

For |G| = 6, a non-Abelian group appears for the first time, namely the group 
isomorphic to $\mathfrak{S}_3$, or to the symmetry group of an equilateral triangle; defining 
relations for it are given in § 12, (6) and (7). Thus: 

$|G| = 6$: $G \cong \mathfrak{S}_3$ or $\mathbb{Z}/(2) \oplus \mathbb{Z}/(3)$. 
$|G| = 7$: $G \cong \mathbb{Z}/(7)$. 

For |G| = 8 there are already two nonisomorphic non-Abelian groups. One 
of these is the group $D_4$ of symmetries of the square; this is also the group 
generated by two elements s and t with defining relations $s^2 = e$, $t^4 = e$, $(st)^2 = e$ 
(here s is a reflection in one of the medians and t is a rotation through 90°). The 
other non-Abelian group $H_8$ can be described in terms of the quaternion algebra 
(§8, Example 5); it consists of 1, i, j, k, −1, −i, −j, −k, multiplying together as 
quaternions. Thus: 

$|G| = 8$: $G \cong D_4$, $H_8$, $\mathbb{Z}/(8)$, $\mathbb{Z}/(4) \oplus \mathbb{Z}/(2)$ or $(\mathbb{Z}/(2))^3$. 
$|G| = 9$: $G \cong \mathbb{Z}/(9)$ or $\mathbb{Z}/(3) \oplus \mathbb{Z}/(3)$. 

For |G| = 10 a non-Abelian group again arises, isomorphic to the group $D_5$ 
of symmetries of a regular pentagon, generated by elements s and t with defining 
relations $s^2 = e$, $t^5 = e$, $(st)^2 = e$ (where s is a reflection in an axis and t a rotation 
through $2\pi/5$). Thus: 

$|G| = 10$: $G \cong D_5$ or $\mathbb{Z}/(5) \oplus \mathbb{Z}/(2)$. 
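
The two non-Abelian groups of order 8 can be told apart by a small computation. In the sketch below the symmetry group of the square is realised by permutations of the vertices 0, 1, 2, 3 (with t the rotation and s a reflection swapping 0 with 1 and 2 with 3), and the quaternion group by integer 4-tuples multiplied as quaternions. These realisations and the helper names are choices made for this illustration. The code checks the defining relations and then compares the orders of the elements.

```python
def generate(generators, mul, identity):
    """Close a set of generators of a finite group under multiplication."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = mul(g, h)
            if gh not in elements:
                elements.add(gh)
                frontier.append(gh)
    return elements


def element_order(g, mul, identity):
    k, power = 1, g
    while power != identity:
        power = mul(power, g)
        k += 1
    return k


def order_profile(elements, mul, identity):
    """Sorted list of element orders; isomorphic groups have equal profiles."""
    return sorted(element_order(g, mul, identity) for g in elements)


def compose(p, q):
    """Permutation product: apply q first, then p."""
    return tuple(p[i] for i in q)


def qmul(p, q):
    """Hamilton product of quaternions (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


E4 = (0, 1, 2, 3)
t = (1, 2, 3, 0)  # rotation of the square through 90 degrees
s = (1, 0, 3, 2)  # reflection in a median
st = compose(s, t)
D4 = generate([s, t], compose, E4)

ONE = (1, 0, 0, 0)
H8 = generate([(0, 1, 0, 0), (0, 0, 1, 0)], qmul, ONE)

assert compose(s, s) == E4 and compose(st, st) == E4
assert element_order(t, compose, E4) == 4
assert len(D4) == 8 and len(H8) == 8
assert order_profile(D4, compose, E4) == [1, 2, 2, 2, 2, 2, 4, 4]
assert order_profile(H8, qmul, ONE) == [1, 2, 4, 4, 4, 4, 4, 4]
```

The symmetry group of the square has five elements of order 2, whereas the quaternion group has only one (namely −1), so the two groups cannot be isomorphic.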


We include a table giving the number of groups of given order ≤ 32: 


order               1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
number of groups    1   1   1   2   1   2   1   5   2   2   1   5   1   2   1  14

order              17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32
number of groups    1   5   1   5   2   2   1  15   2   2   5   4   1   4   1  51


Of course, in fact the study of the structure of finite groups uses not just their 
order, but also other more precise invariants, and moreover uses methods of 
constructing more complicated groups from simpler ones. 

We return to the general notion of group and describe the basic methods of 
constructing groups. One of these we have already used, namely direct product. 
By analogy with the case of two factors, we define the direct product $G_1 \times \cdots \times G_m$ 
of any finite number of groups $G_1, \ldots, G_m$: this consists of sequences 
$(g_1, \ldots, g_m)$ with $g_i \in G_i$, with multiplication defined component-by-component. 
The groups $G_i$ can be viewed as subgroups of the product $G_1 \times \cdots \times G_m$, by 
identifying $g \in G_i$ with the sequence (e, ..., g, ..., e) with g in the ith place. Under 
this the $G_i$ are normal subgroups of the product $G_1 \times \cdots \times G_m$, generating it and 
satisfying $G_i \cap (G_1 \times \cdots \times G_{i-1} \times e \times G_{i+1} \times \cdots \times G_m) = e$. It is not hard to 
show that conversely, if a group G is generated by normal subgroups $G_i$ with the 
property that $G_i \cap (G_1 \cdots G_{i-1} G_{i+1} \cdots G_m) = e$ then G is isomorphic to the direct 
product of the $G_i$. 

We have seen that for finite Abelian groups, or even for finitely generated 
Abelian groups, direct sum turns out to be a powerful construction, sufficient for 
a complete classification of these groups. However, for this it is essential that the 
direct sum decomposition appearing in the classification theorem (§6, Theorem 
II) is unique. It is natural to ask whether for non-Abelian groups also, decomposition 
as a direct product of groups which cannot be decomposed any further is 
unique. We give the answer to this in the simplest case, which will however be 
sufficient in most applications. By analogy with modules, we consider chains of 
subgroups 

$$G \supsetneq H_1 \supsetneq H_2 \supsetneq \cdots \supsetneq H_k. \tag{1}$$


If the length of all such chains is bounded then G is a group of finite length. Finite 
groups are of this kind. If G is a Lie group or an algebraic group, then it is 
natural to consider in the definition only chains (1) with $H_i$ connected closed Lie 
subgroups or algebraic subgroups. Then the dimension of subgroups in a chain 
(1) must decrease, so that the length of the chain is bounded by the dimension 
of G. 


I. The Wedderburn-Remak-Shmidt Theorem. A group of finite length has one 
and only one decomposition as a direct sum of normal subgroups which cannot be 
decomposed any further. More precisely, any two such decompositions must have 
the same number of factors and the factors must be isomorphic in pairs. 


However, a non-Abelian group (for example a finite group or a Lie group) is 
only in exceptional cases decomposable as a direct sum: the great majority of 
them are indecomposable. A more universal method of reducing groups to 
simpler component parts is provided by the notion of homomorphism. If 
G’ = G/N then the homomorphism $G \to G'$ allows us to view G’ as a kind of 
simplified version of G, obtained by considering G ‘up to elements of N’. Now, 
what is the extreme nontrivial extent to which a group can be ‘simplified’ in this 
way? We could consider a group homomorphism $G' \to G''$, and so on. If G is of 
finite length then this process must stop, which happens when we arrive at a 
group G not having any nontrivial homomorphisms; this means that G does not 
have any normal subgroups other than {e} and G itself. 

We say that a group G without any normal subgroups other than {e} and G 
itself is simple. In the case of Lie groups or algebraic groups it is natural to talk 
of connected normal Lie subgroups or algebraic subgroups. Thus a Lie group 
which is simple according to our definition may fail to be simple as an abstract 
group if it contains a discrete normal subgroup. An example is R, containing the 
subgroup Z. We have seen that every group of finite length has a homomorphism 
to a simple group. Let $G \to G'$ be one such homomorphism and $N_1$ its kernel. It 
is then natural to apply the same construction to $N_1$. We obtain a homomorphism 
$N_1 \to G''$ to a simple group G” with kernel $N_2$ which is a normal subgroup 
in $N_1$ (but not necessarily in G). Continuing this procedure, we obtain a chain 
$G = N_0 \triangleright N_1 \triangleright \cdots \triangleright N_k \triangleright N_{k+1} = \{e\}$, in which the quotient groups $N_i/N_{i+1}$ are 
simple. Such a chain is called a composition series for G, and the quotient groups 
$N_i/N_{i+1}$ the quotients of the composition series. Of course, the same group may 
have different composition series, so that the following result is very important. 


II. The Jordan-Hölder Theorem. Two composition series of the same group have 
the same length, and their quotients are isomorphic in pairs (but possibly occur in 
a different order). 
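
For a cyclic group the theorem can be observed directly. The subgroups of Z/(n) correspond to the divisors of n, so a composition series amounts to stripping off one prime factor of n at a time, and the quotients are cyclic groups of prime order. The sketch below (function names ours, for illustration only) lists the sequences of quotient orders along all composition series of Z/(n) and checks that they are rearrangements of one another.

```python
def prime_divisors(n):
    """Distinct prime divisors of n in increasing order."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def composition_factor_sequences(n):
    """Orders of the successive quotients along every composition series of Z/(n).

    The first step passes to the unique subgroup of index p, which is cyclic of
    order n // p, for some prime p dividing n; the rest is handled recursively.
    """
    if n == 1:
        return [[]]
    sequences = []
    for p in prime_divisors(n):
        for tail in composition_factor_sequences(n // p):
            sequences.append([p] + tail)
    return sequences


series_12 = composition_factor_sequences(12)
assert series_12 == [[2, 2, 3], [2, 3, 2], [3, 2, 2]]
# Same length, and the same quotients up to order, as the theorem asserts.
assert all(sorted(seq) == [2, 2, 3] for seq in series_12)
```

For Z/(12) there are three composition series, and each has the quotients Z/(2), Z/(2), Z/(3) in some order.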


The proofs of the Jordan-Hölder and the Wedderburn-Remak-Shmidt 
theorems make very little use of properties of groups. The basic fact which they 
use is the following. 


Lemma. If H is a subgroup of G and N a normal subgroup of G then $H \cap N$ is 
normal in H and 


$$H/(H \cap N) \cong HN/N. \tag{2}$$


Here HN is the subgroup of G consisting of all products of the form hn with 
$h \in H$ and $n \in N$. 


For consider the homomorphism $f\colon G \to G/N$; the restriction of f to H defines 
a homomorphism $f_1\colon H \to G/N$ with kernel $H \cap N$ and image $H_1 \cong H/(H \cap N)$. 
The inverse image of $H_1$ under f is HN, so that $H_1 = HN/N$, and (2) follows from 
this. 
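
The lemma can be checked by hand in a small case. In the sketch below G is the group of all permutations of {0, 1, 2, 3}, N is its normal subgroup of order 4 consisting of the identity and the three products of two disjoint transpositions, and H is the cyclic subgroup generated by a 4-cycle. This choice of example and the helper names are ours. The code verifies that $H \cap N$ is normal in H, that HN is closed under multiplication, and that the cosets hN with h in H are exactly the cosets of N in HN, in bijection with the cosets of $H \cap N$ in H.

```python
from itertools import permutations


def compose(p, q):
    """Permutation product: apply q first, then p."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def is_normal(sub, group):
    """Check that g x g^{-1} lies in `sub` for all g in `group`, x in `sub`."""
    return all(compose(compose(g, x), inverse(g)) in sub for g in group for x in sub)


def left_cosets(elements, sub):
    """The set of cosets x*sub for x running over `elements`."""
    return {frozenset(compose(x, y) for y in sub) for x in elements}


E = (0, 1, 2, 3)
t = (1, 2, 3, 0)  # a 4-cycle
G = set(permutations(range(4)))
H = {E, t, compose(t, t), compose(t, compose(t, t))}
N = {E, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}

HN = {compose(h, n) for h in H for n in N}
H_cap_N = H & N

assert is_normal(N, G)
assert is_normal(H_cap_N, H)
assert all(compose(x, y) in HN for x in HN for y in HN)  # HN is a subgroup
# The map h(H ∩ N) -> hN is onto HN/N and the two quotients have the same size.
assert left_cosets(H, N) == left_cosets(HN, N)
assert len(left_cosets(H, H_cap_N)) == len(left_cosets(HN, N)) == 2
```

Here H has order 4, $H \cap N$ has order 2 and HN has order 8, so both sides of (2) are groups of order 2.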

The proof of both theorems is based on the idea that replacing the pair 
(H, $H \cap N$) by (HN, N) for various choices of H and N, we can pass from one 
decomposition of a group as a direct sum of indecomposable subgroups to 
another, or from one composition series to another. In essence these arguments 
use only the properties of the partially ordered set of subgroups of G, and in this 
form they can be axiomatised. This treatment is useful, in that it applies also to 
modules of finite length, and gives analogues of these two theorems for these 
(compare § 9, Theorem II). 

Thus the problem of describing groups (finite groups, Lie groups, or algebraic 
groups) reduces to the following two questions: 

(1) Describe all groups having a given collection of groups as quotients of a 
composition series. 

(2) Which groups can be the quotients of a composition series? 

The first question can be studied inductively, and we then arrive at the 
following question: for given N and F, describe all groups G having a normal 
subgroup isomorphic to N with quotient group isomorphic to F; in this case we 
say that G is an extension of F by N. For example, a crystallographic group G 
(§ 14, Example 2) is an extension of F by A, where A is the group of translations 
contained in G, consisting of translations in the vectors of a certain lattice C, and 
F is a symmetry group of C. 

Although in this generality the question is unlikely to have a complete answer, 
various approaches to it are known which in concrete situations lead to a 
more-or-less satisfactory picture (concerning this, see §21, Example 4). 

Much more intriguing is the second of the above questions. Since the quotients 
of composition series are simple groups, and any set of simple groups is a factor 
of a composition series of some group (for example, their direct sum), our 
question is equivalent to the following: 


WHAT ARE THE SIMPLE GROUPS? 


At the present time, the answer to this is known for the most important types 
of groups: finite groups, Lie groups and algebraic groups. 

We start with a case which would seem to be trivial, but which has a large 
number of applications, the simple Abelian groups. From the point of view of 
abstract group theory, the answer is obvious: the simple Abelian groups are just 
the cyclic groups of prime order. In the theory of Lie groups, we gave the 
definition of a simple group in terms of connected normal subgroups. Hence in 
the theory of connected differentiable Lie groups there are two further examples: 
the additive group R of real numbers, and the circle group. We will not give the 
answer for complex-analytic Lie groups, which is more complicated. Similarly, 
for algebraic matrix groups (or linear algebraic groups) over an algebraically 
closed field there are two new examples: $G_a$, the additive group of elements of the 
ground field, and $G_m$, the multiplicative group. 

This meagre collection of simple Abelian groups leads to a quite extensive class 
of groups when used as quotients in composition series, following the ideas 
described above. A group having a composition series with Abelian quotients is 
said to be solvable. It is easy to check the following properties: 


Theorem III. A subgroup or homomorphic image of a solvable group is solvable; 
if a group has a solvable normal subgroup with solvable quotient then it is solvable. 
If a group is solvable then it has a normal subgroup with Abelian quotient, that is, 
a nontrivial homomorphism to an Abelian group. 


For any group G, the intersection of the kernels of all the homomorphisms 
$f: G \to A$ of G to Abelian groups is called the commutator subgroup of G, and 
written G’. 

Obviously, for any elements $g_1, g_2 \in G$, the products $g_1g_2$ and $g_2g_1$ go into 
the same element under any homomorphism to an Abelian group, and hence 
$g_1g_2(g_2g_1)^{-1} = g_1g_2g_1^{-1}g_2^{-1}$ goes to the identity. An element of the form 
$g_1g_2g_1^{-1}g_2^{-1}$ is called a commutator. We see that all commutators are contained 
in the commutator subgroup; it is not hard to prove that they generate it, so that 


$$G' = \langle g_1g_2g_1^{-1}g_2^{-1} \mid g_1, g_2 \in G \rangle.$$ 


If G is solvable, then $G' \neq G$ (provided that $G \neq \{e\}$). But as a subgroup of a 
solvable group, $G'$ is again solvable, so that either $G' = \{e\}$ or $G'' = (G')' \neq G'$. 
Continuing this procedure shows that taking successive commutator subgroups 
of a solvable group, we eventually arrive at $\{e\}$; that is, if we set $G^{(i)} = (G^{(i-1)})'$, 
then $G^{(n)} = \{e\}$ for some n. It is easy to see that this is also a sufficient condition 
for a group of finite length to be solvable. Abelian groups are characterised by 
the fact that G’ = {e}. In this sense, solvable groups are a natural generalisation 
of Abelian groups. 
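
The criterion above can be tried out on small permutation groups. The following is a minimal sketch, not part of the original text: permutations of {0, ..., n-1} are stored as tuples, and the helper names (`generated_subgroup`, `commutator_subgroup`, `derived_series_orders`) are hypothetical. It computes the orders of the successive commutator subgroups until they stop shrinking; the group is solvable exactly when the last order is 1.

```python
from itertools import permutations

def compose(p, q):
    """The permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, n):
    """Subgroup generated by a set of permutations of {0, ..., n-1}."""
    identity = tuple(range(n))
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def commutator_subgroup(group, n):
    """Subgroup generated by all commutators g h g^-1 h^-1."""
    commutators = {
        compose(compose(g, h), compose(inverse(g), inverse(h)))
        for g in group for h in group
    }
    return generated_subgroup(commutators, n)

def derived_series_orders(group, n):
    """Orders of G, G', G'', ... until the series stabilises."""
    orders = [len(group)]
    while True:
        nxt = commutator_subgroup(group, n)
        if len(nxt) == len(group):
            return orders
        group = nxt
        orders.append(len(group))

s4_orders = derived_series_orders(set(permutations(range(4))), 4)
s5_orders = derived_series_orders(set(permutations(range(5))), 5)
print(s4_orders, s5_orders)
```

For the symmetric group on 4 letters the series reaches the trivial group, while for 5 letters it stabilises at the alternating group of order 60, in agreement with the fact that the alternating group on 5 letters is simple and non-Abelian.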


Example 2. Among the finite groups we have met, the following are solvable 
(in addition to Abelian groups): 

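$\mathfrak{S}_3$, with the composition series $\mathfrak{S}_3 \supset \mathfrak{A}_3 \supset \{e\}$. 

$\mathfrak{S}_4$, with the composition series $\mathfrak{S}_4 \supset \mathfrak{A}_4 \supset B_4 \supset \{e\}$, where $B_4$ is the 
subgroup consisting of e and all the elements of cycle-type (2, 2). 

GL(2, $F_2$) and GL(2, $F_3$). 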

We say that a finite group is a p-group if its order is a power of a prime 
number p. 


Theorem IV. A finite p-group is solvable. 


In fact, consider the adjoint action of G on itself; its orbits are the conjugacy 
classes of elements of G, say $C_1, \dots, C_h$. Suppose that $C_i$ has $k_i$ elements, so that 
$|G| = k_1 + \cdots + k_h$. Then as we have seen (§ 12, (10)), $k_i = (G : S_i)$ where $S_i$ is the 
stabiliser of some element $g_i \in C_i$. The stabiliser $S_i$ is a subgroup of G, so that 
$(G : S_i)$ divides the order of G, and hence is a power of p. Moreover, $k_i = 1$ if 
and only if $C_i$ consists of one element only, which is then contained in the centre of G. 
In the equation $|G| = k_1 + \cdots + k_h$, the left-hand side is a power of p, and the 
right-hand side is a sum of terms which are also powers of p (possibly equal to 1). It 
follows from this that the number of $k_i$ which are equal to 1 must be divisible by 
p. Since the class of the identity element contributes one such term, there are at 
least p of them. This proves that a finite p-group has a nontrivial centre Z. Since Z is 
Abelian, it is solvable, and by induction, we can assume that the quotient G/Z is also 
solvable. Hence G is solvable. 
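
The counting argument can be checked on a small case. The sketch below is an illustration only: it takes the symmetry group of a square (order 8 = 2^3) realised as permutations of the four vertexes, which is a choice made here and not an example from the text, and the helper names are hypothetical. It lists the sizes of the conjugacy classes and the centre.

```python
def compose(p, q):
    """The permutation i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def closure(generators):
    """Group generated by a list of permutations of the same degree."""
    identity = tuple(range(len(generators[0])))
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def conjugacy_classes(group):
    remaining, classes = set(group), []
    while remaining:
        x = next(iter(remaining))
        cls = {compose(compose(g, x), inverse(g)) for g in group}
        classes.append(cls)
        remaining -= cls
    return classes

rotation = (1, 2, 3, 0)    # vertex i goes to vertex i + 1 (mod 4)
reflection = (0, 3, 2, 1)  # reflection fixing vertexes 0 and 2
group = closure([rotation, reflection])

class_sizes = sorted(len(c) for c in conjugacy_classes(group))
centre = {x for x in group if all(compose(g, x) == compose(x, g) for g in group)}
print(len(group), class_sizes, len(centre))
```

Every class size is a power of 2, the sizes add up to 8, and the classes of size 1 are exactly the elements of the centre, whose number is divisible by 2.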

We now give some examples of solvable Lie groups. 


Example 3. E(2), the group of all orientation-preserving motions of the 
Euclidean plane; this has the composition series $E(2) \supset T \supset T' \supset \{e\}$, where T 
is the group of all translations, and $T'$ the subgroup of all translations in some 
given direction. The quotients of this composition series are $E(2)/T \cong SO(2) \cong 
R/Z$, $T/T' \cong R$ and $T' \cong R$. 

Example 4. The group of all upper-triangular matrixes of the form 

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix}, \quad \text{with } a_{11}a_{22}\cdots a_{nn} \neq 0,$$ 

where the $a_{ij}$ belong to some field K. This is an algebraic group; for K = R or C 
it is a Lie group; and for K a finite field it is a finite group. 

We return to the original question on the structure of simple groups and 
restrict ourselves to the nontrivial case of non-Abelian simple groups. For 
completely arbitrary groups there is of course no precise answer to our question. 
We now run through the non-Abelian groups we have met and say which of them 
are simple. 


Theorem V. The following series of groups are simple. 


(a) Finite groups: 


$\mathfrak{A}_n$, the alternating group, for $n \ge 5$. 

PSL(n, $F_q$), except for the case n = 2, q = 2 or 3. 

(b) The series $\mathcal{C}omp$ of classical compact Lie groups: 

SU(n) for n > 1; 

SO(n) for $n \neq 1, 2, 4$; 

SpU(n) for n > 1. 

(c) The series $\mathcal{L}ie_C$ of classical complex analytic Lie groups: 

SL(n, C) for n > 1; 

SO(n, C) for $n \neq 1, 2, 4$; 

Sp(2n, C) for n > 1. 


(d) The series $\mathcal{A}lg_K$ of classical algebraic matrix groups (over an arbitrary 
algebraically closed field K): 


SL(n, K) for n > 1; 

SO(n, K) for $n \neq 1, 2, 4$; 

Sp(2n, K) for n > 1. 

As we have already observed, groups of the series (b), (c) and (d) are not simple 
as abstract groups. They contain a nontrivial centre Z, and if $Z_0 \subset Z$ is any 
subgroup in the centre of one of the groups G listed then $G/Z_0$ will also be a 
simple group in the sense of our definition; for example PSU(n) = SU(n)/$Z_0$ for 
$Z_0 = Z$. In what follows we will without further mention include these trivial 
modifications in the same series $\mathcal{C}omp$, $\mathcal{L}ie_C$ and $\mathcal{A}lg_K$. 

One of the greatest achievements of mathematics of modern times has been 
the proof that in the final three cases the examples given almost exhaust all simple 
groups. In various formulations of the problem this discovery, first made in the 
19th century, has been extended and made more precise, to cover all the cases 
considered (and also arbitrary differentiable simple Lie groups, not necessarily 
compact). This theory has an enormous number of applications. The discovery 
of the regular polyhedrons, corresponding to the finite subgroups of motions of 
space, is considered as the highest achievement of mathematics in antiquity— 
Euclid’s Elements ends with the description of the regular polyhedrons. These 
were the most profound symmetries discovered by mathematics in antiquity. The 
discovery and classification of the simple Lie groups occupies the same position 
in the mathematics of modern times: these are the most delicate symmetries 
accessible to the understanding of modern mathematics. And just as Plato 
considered the tetrahedron, the octahedron, the cube and the icosahedron to be 
forms of the four elements—fire, air, earth and water (leaving the dodecahedron 
as a symbol of the cosmos), so modern physicists attempt to find general laws 
governing the variety of elementary particles in terms of properties of various 
simple groups SU(2), SU(3), SU(4), SU(6) and others. 

We will not give a full statement of the result. It turns out that there exist 
exactly another 5 groups, the exceptional simple groups, denoted by $E_6$, $E_7$, $E_8$, 
$G_2$ and $F_4$, of dimensions 78, 133, 248, 14 and 52, which need to be added to the 
three series indicated above to provide a list of all simple groups (in each of the 
three series (b), (c) and (d)). The relation between the three resulting types of 
simple groups, the compact, complex analytic and matrix algebraic groups over 
a field K, is very simple for the latter two types: the complex groups arise from 
the algebraic groups when K = C. The relation between the complex and com- 
pact groups has in fact already been indicated in § 15: the compact groups are 
maximal compact subgroups of the corresponding complex analytic ones, and 
all maximal compact subgroups are conjugate. 

The classification of differentiable simple Lie groups is a little more com- 
plicated than the classification of just the compact ones, but conceptually it is 
just as clear. Each type of compact simple group has a number of analogues in 
the noncompact case. Let us describe for example the analogues of the compact 
groups SU(n): these are the groups SU(p, q), where U(p, q) is the group of complex 
linear transformations preserving the form $|z_1|^2 + \cdots + |z_p|^2 - |z_{p+1}|^2 - \cdots - 
|z_{p+q}|^2$, and the groups SL(n, R), SL(n, C) and SL(n/2, H) (if n is even), considered 
as differentiable Lie groups. The final group SL(n, H) requires some explanation. 
Since the usual definition of determinant is not applicable to matrixes over a 
noncommutative ring, it is not clear what the notation S means. The answer 
consists of noting that SL(n,R) and SL(n,C) can also be defined in another 
way. That is, it is not hard to prove that SL(n,R) coincides with the com- 
mutator subgroup of GL(n, R), and SL(n, C) is obtained in the same way from 
GL(n, C). Thus for SL(n, H) we take by definition the commutator subgroup of 
GL(n, H). 

We have not said anything so far about finite simple groups. Their classification 
is a very extensive problem. The brave conjecture that they should in some sense 
be analogous to the simple Lie groups had already been made in the 19th century. 
This relation is hinted at by the example of PSL(n, $F_q$) which we have met. More 
concretely, the following approach is possible, which can at least provide many 
examples. For each simple algebraic matrix group G over the finite field $F_q$, we 
need to consider the group G($F_q$) consisting of matrixes with entries in $F_q$. The 
simple algebraic groups G over $F_q$ are practically already known; if we replace 
$F_q$ by its algebraic closure K then the simple algebraic groups over K are provided 
by the theory we have just described: the list of groups $\mathcal{A}lg_K$ and the 5 exceptional 
groups. However, it may happen that two groups defined over $F_q$ and not 
isomorphic over $F_q$ turn out to be isomorphic over K (or might no longer be 
simple over K). In essence, the same phenomenon has already appeared in the 
theory of real Lie groups, where the analogue of $F_q$ is R and of K is C. For 
example, all the groups SU(p, q) with p + q = n are algebraic groups over R 
(because each complex entry of the matrix can be given by its real and imaginary 
parts). But one can show that they all become isomorphic to each other and to 
SL(n, C) when considered over C. A similar situation occurs also for the fields 
$F_q$. The question of classifying all algebraic groups over $F_q$, assuming them 
known over the algebraic closure of $F_q$, is essentially a matter of overcoming 
technical difficulties, and the answer to it is known. This leads to a list of simple 
algebraic groups G defined over $F_q$, and for each of these we can construct a finite 
group G($F_q$). If we carry out this construction with the appropriate amount of 
care (for example, we should consider the group PSL(n, $F_q$) rather than SL(n, $F_q$)), 
then it turns out that all of these groups are simple, now as finite groups rather 
than as algebraic groups. There exist just a few exceptions corresponding to 
groups G of small dimension and small values of q (we have already seen this 
effect in the example of PSL(n, $F_q$)). In this way one arrives at a number of series 
of simple finite groups, called groups of algebraic type. 

To the groups of algebraic type we must add one more series, the alternating 
groups $\mathfrak{A}_n$ for $n \ge 5$. Already in the 19th century, however, examples of groups 
began to appear which cannot be fitted into any of these series. But such examples 
always turned up as individuals, and not as infinite series. Up to now 26 such 
finite simple groups have been discovered, which are neither groups of algebraic 
type nor alternating groups $\mathfrak{A}_n$. These are called sporadic simple groups. The 
biggest of these is of order 


$$2^{46} \cdot 3^{20} \cdot 5^{9} \cdot 7^{6} \cdot 11^{2} \cdot 13^{3} \cdot 17 \cdot 19 \cdot 23 \cdot 29 \cdot 31 \cdot 41 \cdot 47 \cdot 59 \cdot 71$$ 


(not for nothing is it called the Monster). At the present time, it seems to have 
been proved that the groups of algebraic type, the alternating groups and the 26 
sporadic groups exhaust all finite simple groups. This is without doubt a result 
of the first importance. Unfortunately, it has been obtained as the result of many 
years of effort on the part of several dozen mathematicians, and the proof is 
scattered over hundreds of articles, adding up to tens of thousands of pages. 
Hence a certain time probably has to elapse before this achievement has been 
accepted and digested by mathematicians to the same extent as the analogous 
classification of simple Lie groups and algebraic groups. 
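
To get a feeling for the size of the Monster, one can multiply out the prime factorisation of its order. The short sketch below does only this arithmetic; the dictionary name `monster_factorisation` is a hypothetical label chosen for the illustration.

```python
# Prime factorisation of the order of the Monster: prime -> exponent.
monster_factorisation = {
    2: 46, 3: 20, 5: 9, 7: 6, 11: 2, 13: 3,
    17: 1, 19: 1, 23: 1, 29: 1, 31: 1, 41: 1, 47: 1, 59: 1, 71: 1,
}

order = 1
for prime, exponent in monster_factorisation.items():
    order *= prime ** exponent

print(order)            # the exact order as a Python integer
print(len(str(order)))  # number of decimal digits
```

The order is a 54-digit number, roughly 8 times 10 to the power 53.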


§ 17. Group Representations 


We recall that a representation of a group G is a homomorphism of G to the 
group Aut L of linear transformations of some vector space L (see § 9); this notion 
is closely related to the idea of ‘coordinatisation’. The meaning of coordinatisa- 
tion is to specify objects forming a homogeneous set X by assigning individually 
distinguishable quantities to them. Of course such a specification is in principle 
impossible: considering the inverse map would then make the objects of X 
themselves individually distinguishable. The resolution of this contradiction is 
that, in the process of coordinatisation, apart from the objects and the quantities, 
there is in fact always a third ingredient, the coordinate system (in one or other 
sense of the word), which is like a kind of physical measuring instrument. Only 
after fixing a coordinate system S can one assign to a given object $x \in X$ a definite 
quantity, its ‘generalised coordinate’. But then the fundamental problem arises: 
how to distinguish the properties of the quantities that reflect properties of the 
objects themselves from those introduced by the choice of the coordinate system? 
This is the problem of invariance of various relations arising in theories of this 
kind. In spirit, it is entirely analogous to the problem of the observer in theo- 
retical physics. 

If we have two coordinate systems S and S’, then usually one can define an 
automorphism g of X (that is, a transformation of X preserving all notions 
defined in X) that takes S into S’: that is, g is defined by the fact that it takes each 
object x into an object x’ whose coordinate with respect to S’ equals that of x 
with respect to S. Thus all the admissible coordinate systems of our theory 
correspond to certain automorphisms of X, and it is easy to see that the auto- 
morphisms obtained in this way form a group G. This group acts naturally on 
the set of quantities: if $g \in G$ and $gS = S'$, then g takes the coordinate of each 
object in the coordinate system S into that of the same object in the coordinate 
system S’. If the set of quantities forms a vector space, then this action defines a 
representation of G. 

Let’s explain all this by an example. Consider an n-dimensional vector space 
L over a field K. Choosing a coordinate system S in L (that is, a basis), we can 
specify a vector by a set of n numbers. Passing to a different coordinate system 
is given by a linear transformation $g \in GL(n, K)$, which at the same time transforms 
the n coordinates by means of the matrix of the linear transformation. We 
obtain the tautological representation of GL(n, K) by matrixes. But if instead we 
take as our objects the quadratic forms, given in each coordinate system by a 
symmetric matrix, then passing to a different coordinate system involves passing 
from A to $CAC^*$ (where $C^*$ denotes the transpose of C), and we obtain a 
representation of the group GL(n, K) in the space of symmetric matrixes, taking 
$C \in GL(n, K)$ into the linear transformation $A \mapsto CAC^*$. In exactly the same way, 
considering linear transformations instead of quadratic forms, we get another 
representation of GL(n, K), this time in the space of all matrixes, taking 
$C \in GL(n, K)$ into the linear transformation $A \mapsto CAC^{-1}$. Clearly, the same ideas 
apply to any tensors. In either case, quadratic 
forms or linear transformations, we are usually interested in properties of the 
corresponding matrixes that are independent of the choice of a coordinate 
system, that is, are preserved under the substitution $A \mapsto C^*AC$ in the first case, 
or $A \mapsto C^{-1}AC$ in the second. Examples of such properties are the rank of a 
matrix in the first case, and the coefficients of the characteristic polynomial in 
the second. 
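
The two transformation rules can be compared on 2 × 2 integer matrixes. The following is a minimal sketch under assumptions made only for the illustration: the matrixes `c1`, `c2`, `form` and `linear_map` are arbitrary sample data, the transformations C are taken with determinant ±1 so that their inverses stay integral, and all function names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrixes given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inverse2(c):
    """Inverse of an integer 2x2 matrix of determinant +1 or -1."""
    d = det2(c)
    assert d in (1, -1)
    return [[d * c[1][1], -d * c[0][1]], [-d * c[1][0], d * c[0][0]]]

def act_on_form(c, a):
    """Rule for quadratic forms: A -> C A C^T."""
    return matmul(matmul(c, a), transpose(c))

def act_on_map(c, a):
    """Rule for linear transformations: A -> C A C^-1."""
    return matmul(matmul(c, a), inverse2(c))

c1 = [[1, 1], [0, 1]]
c2 = [[0, 1], [-1, 0]]
form = [[2, 1], [1, 3]]        # a symmetric matrix
linear_map = [[1, 2], [3, 4]]

# Both rules are representations: acting by C1*C2 is acting by C2, then by C1.
form_ok = act_on_form(matmul(c1, c2), form) == act_on_form(c1, act_on_form(c2, form))
map_ok = act_on_map(matmul(c1, c2), linear_map) == act_on_map(c1, act_on_map(c2, linear_map))

new_form = act_on_form(c1, form)
conjugated = act_on_map(c1, linear_map)
trace = lambda a: a[0][0] + a[1][1]
print(form_ok, map_ok, new_form, conjugated)
```

In this sample the transformed form is again symmetric and nondegenerate (its rank is unchanged), while conjugation leaves the trace and the determinant, that is, the coefficients of the characteristic polynomial of a 2 × 2 matrix, unchanged.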

A similar situation arises if the conditions of a problem admit a given symmetry 
(that is, they are preserved by some transformation g). Then the set X of all 
solutions of the problem should be taken to itself under the same transformation 
g; that is, the symmetry group of the problem acts on the set X, and this usually 
leads to a representation of the symmetry group. 

A beautiful example of this situation is given in [Michel 84 (1980)]. Consider 
the problem of how to construct a road network taking us from any vertex of a 
square ABCD into any other vertex, and of the shortest possible length. It is not 
hard to prove that the answer is given by the network shown in Figure 38, (a), 
in which the angles AED and BFC are equal to 120°. The square ABCD has the 
symmetry group $D_4$ (§ 13, Theorem III), but Figure 38, (a) obviously does not go 
into itself under this group! The explanation is that the given problem has two 
solutions, shown in Figure 38, (a) and (b), and that together these have the full 
$D_4$ symmetry: $D_4$ acts on the set of two figures (a) and (b). 


Fig. 38. The two road networks (a) and (b) for the square ABCD 


Here is another example which leads to a representation of a symmetry group. 
Suppose given a linear differential equation of the form 

$$\sum_{i=0}^{n} a_i(t) \frac{d^i f}{dt^i} = 0, \qquad (1)$$ 

with coefficients periodic functions of period $2\pi$. Then together with any solution 
f(t), the function $f(t + 2\pi k)$ is also a solution for $k \in Z$, and the map $f(t) \mapsto 
f(t + 2\pi k)$ defines a linear transformation $u_k$ of the n-dimensional space of 
solutions. We obtain a representation of Z given by $k \mapsto u_k$, with $u_{k+l} = u_k u_l$, and 
hence $u_k = u_1^k$. A more complicated version of the same phenomenon involves 
considering equation (1) in the complex domain. Suppose that the $a_i(t)$ are 
rational functions of a complex variable t. If $t_0$ is not a pole of any of the functions 
$a_i(t)$ then near $t_0$, (1) has n linearly independent solutions, which are holomorphic 
in t. Analytically continuing these solutions along a closed curve s not passing 
through any poles of the $a_i(t)$, and with starting point and end point at $t_0$, we 
return to the same space of solutions; thus, as above, we obtain a linear 
transformation u(s), which defines an n-dimensional representation of the fundamental 
group $\pi(C \smallsetminus \{P_1, \dots, P_m\})$, where $P_1, \dots, P_m$ are the poles of the $a_i(t)$. This 
representation is called the monodromy of equation (1). 

Another example. Suppose that a linear differential equation $L(x_1, \dots, 
x_n, F) = 0$ has coefficients which depend symmetrically on $x_1, \dots, x_n$. Then 
permutations of $x_1, \dots, x_n$ define a representation of the group $\mathfrak{S}_n$ in the space 
of solutions. This situation occurs in the quantum mechanical description of a 
system consisting of n identical particles. The state of such a system is given by 
a wave function $\psi(q_1, \dots, q_n)$, where $q_i$ is the set of coordinates of the ith particle, 
and $\psi$ is determined up to a scalar multiple $\lambda$ with $|\lambda| = 1$. A permutation $\sigma$ of 
the particles does not change the state, that is, it must multiply $\psi$ by a constant. 
We get the relation $\psi(q_{\sigma(1)}, \dots, q_{\sigma(n)}) = \lambda(\sigma)\psi(q_1, \dots, q_n)$, from which it follows 
that $\lambda(\sigma_1\sigma_2) = \lambda(\sigma_1)\lambda(\sigma_2)$, that is, $\sigma \mapsto \lambda(\sigma)$ is a 1-dimensional representation of 
the group $\mathfrak{S}_n$. We know two such representations, the identity $\varepsilon(\sigma) = 1$, and the 
parity representation $\eta$ given by $\eta(\sigma) = 1$ for even and $-1$ for odd permutations. 
It is easy to prove that there are no other 1-dimensional representations of $\mathfrak{S}_n$. 
Thus 


either $\psi(q_{\sigma(1)}, \dots, q_{\sigma(n)}) = \psi(q_1, \dots, q_n)$ for all $\sigma$, 
or $\psi(q_{\sigma(1)}, \dots, q_{\sigma(n)}) = \eta(\sigma)\psi(q_1, \dots, q_n)$ for all $\sigma$. 


Which of these two cases occurs depends on the nature of the particles. We say 
that the particles are governed by Bose-Einstein statistics in the first case (for 
example, photons), and by Fermi-Dirac statistics in the second (for example, 
electrons, protons and neutrons). 
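
The parity representation and the second (Fermi-Dirac) case can be illustrated numerically. The sketch below is an assumption-laden toy: permutations are tuples acting on indices 0, ..., n-1, the function `psi` is an arbitrary sample polynomial rather than a physical wave function, and the helper names are hypothetical. It checks that the sign is multiplicative on all permutations of 4 letters, and that an antisymmetrised function picks up the sign under every permutation of its arguments.

```python
from itertools import permutations

def compose(p, q):
    """The permutation i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def sign(p):
    """+1 for even and -1 for odd permutations, by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def permuted(q, s):
    """The tuple (q_{s(1)}, ..., q_{s(n)})."""
    return tuple(q[i] for i in s)

def antisymmetrise(psi, n):
    """Sum of sign(s) * psi(q_s) over all permutations s of n letters."""
    perms = list(permutations(range(n)))
    return lambda q: sum(sign(s) * psi(permuted(q, s)) for s in perms)

s4 = list(permutations(range(4)))
sign_is_multiplicative = all(
    sign(compose(a, b)) == sign(a) * sign(b) for a in s4 for b in s4
)

psi = lambda q: q[0] * q[1] ** 2 * q[2] ** 3   # arbitrary sample function
fermionic = antisymmetrise(psi, 3)
q = (2, 3, 5)
fermionic_ok = all(
    fermionic(permuted(q, s)) == sign(s) * fermionic(q)
    for s in permutations(range(3))
)
print(sign_is_multiplicative, fermionic_ok)
```

Replacing `sign(s)` by 1 in `antisymmetrise` gives the symmetric (Bose-Einstein) case instead.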

In §9 we defined the main notions of group representation theory: invariant 
subspace, irreducible representation, direct sum of representations, regular 
representation, the character of a representation. In the following, we consider 
representations over the complex number field of some of the main types 
of groups we have met: finite groups, compact Lie groups and complex Lie 
groups. 

A. Representations of Finite Groups 


In §10 we obtained a series of theorems for finite groups as a particular 
case of the theory of semisimple algebras: the finiteness of the number of irreducible 
representations; the theorem that the regular representation decomposes 
into irreducible factors, among which each irreducible representation occurs; 
Burnside’s theorem $|G| = \sum_{i=1}^{h} n_i^2$ (where the $n_i$ are the ranks of the irreducible 
representations of G); the theorem that the number h of irreducible representations 
is equal to the rank of the centre Z(C[G]) of the group algebra of G; the 
fact that an irreducible representation is uniquely determined by its character. 

Consider first the case of Abelian groups. Let $\rho: G \to \operatorname{Aut}_C L$ be an irreducible 
representation of such a group (whether G is finite or not plays no role). Then 
the transformations $\{\sum \alpha_g \rho(g) \mid g \in G \text{ and } \alpha_g \in C\}$ form an irreducible algebra in 
$\operatorname{End}_C L$, which by Burnside’s theorem (§ 10, Theorem XVII) must be the whole 
of $\operatorname{End}_C L$. Since G is Abelian, the algebra $\operatorname{End}_C L$ is commutative, and this 
is only possible if the rank of the representation is 1. Thus, an irreducible 
representation of an Abelian group has rank 1. 

For finite Abelian groups, the same result is also clear from other considerations. 
The group algebra C[G] is commutative and hence by § 10, Theorem V, 
is isomorphic to a direct sum of fields $C[G] \cong C^n$. Its irreducible representations 
come from the projection onto the factors C of this decomposition, that is, if 
$x = (z_1, \dots, z_n)$ then $\chi_i(x) = z_i$. We therefore have the following result. 


Theorem I. All irreducible representations of a finite Abelian group are 1- 
dimensional, and their number is equal to the order of the group. 


Thus irreducible representations of a finite Abelian group G are the same thing 
as homomorphisms $\chi: G \to C^* = GL(1, C)$ to the multiplicative group of complex 
numbers; $\chi$ coincides with its trace, and hence is called a character. 

Homomorphisms of any group G (not necessarily Abelian) into $C^*$ (or into 
any Abelian group) can be multiplied element-by-element: by definition, the 
product of characters $\chi_1$ and $\chi_2$ is the character $\chi = \chi_1\chi_2$ defined by 


$$\chi(g) = \chi_1(g)\chi_2(g). \qquad (2)$$ 


It is easy to see that under this definition of product, the characters of an Abelian 
group G themselves form a group, the character group of G, which is denoted by 
$\hat{G}$. The identity is the character $\varepsilon$ with $\varepsilon(g) = 1$ for $g \in G$; the inverse of $\chi$ is the 
character $\chi^{-1}(g) = \chi(g)^{-1}$. It can be shown that for a finite Abelian group G, the 
character group $\hat{G}$ not only has the same order as G, but is isomorphic to G as 
an abstract group. However, there does not exist any ‘natural’ isomorphism 
between these groups. But the character group $\hat{\hat{G}}$ of $\hat{G}$ is naturally isomorphic to 
G: the formula (2) shows that the map $G \to \hat{\hat{G}}$ given by $g(\chi) = \chi(g)$ is a 
homomorphism, which is an isomorphism, as one checks easily. The situation here is 
similar to the notion of the dual of a vector space. 

The relation between G and its character group extends also to subgroups. 
Sending a subgroup $H \subset G$ to the subgroup $H^* \subset \hat{G}$ of characters taking the 
value 1 on all the elements of H, we get a 1-to-1 correspondence between 
subgroups of G and those of its character group $\hat{G}$. This correspondence is 
order-reversing: if $H_1 \subset H_2$ then $H_1^* \supset H_2^*$. Moreover, $H^* = \widehat{G/H}$. 

Characters satisfy a number of important relations. First of all, if $\chi \neq \varepsilon$ (the 
identity character) then 

$$\sum_{g \in G} \chi(g) = 0. \qquad (3)$$ 

Indeed, by assumption there exists $g_0 \in G$ such that $\chi(g_0) \neq 1$. If we substitute $g_0g$ 
for g in the left-hand side of (3), we see easily that on the one hand it remains 
unchanged, and on the other, it is multiplied through by $\chi(g_0)$; this proves (3). 
For $\chi = \varepsilon$, $\varepsilon(g) = 1$ for all $g \in G$, and hence 

$$\sum_{g \in G} \varepsilon(g) = |G|. \qquad (4)$$ 

Substituting $\chi = \chi_1\chi_2^{-1}$ in (3), where $\chi_1$ and $\chi_2$ are two characters, we get from 
(3) and (4) that 

$$\sum_{g \in G} \chi_1(g)\chi_2(g)^{-1} = \begin{cases} 0 & \text{if } \chi_1 \neq \chi_2, \\ |G| & \text{if } \chi_1 = \chi_2. \end{cases} \qquad (5)$$ 

Since each element g is of finite order, the numbers $\chi(g)$ are roots of unity, and 
so have absolute value 1. Hence $\chi(g)^{-1} = \overline{\chi(g)}$, which allows us to interpret (5) 
as saying that the characters are orthonormal in the space of complex-valued 
functions on G, where we give this space the scalar product defined by 

$$(f_1, f_2) = \frac{1}{|G|} \sum_{g \in G} f_1(g)\overline{f_2(g)}.$$ 

Thus the characters form an orthonormal basis, and any function on G can be 
written as a combination of them: 

$$f = \sum_{\chi \in \hat{G}} c_\chi \chi,$$ 

where the ‘Fourier coefficients’ $c_\chi$ are determined by the formula 

$$c_\chi = \frac{1}{|G|} \sum_{g \in G} f(g)\overline{\chi(g)}.$$ 

Using the symmetry between a group and the character group ($G = \hat{\hat{G}}$), from (3) 
and (5) we get the relations 

$$\sum_{\chi \in \hat{G}} \chi(g) = 0 \quad \text{for } g \neq e, \qquad (4')$$ 

and 

$$\sum_{\chi \in \hat{G}} \chi(g_1)\chi(g_2^{-1}) = \begin{cases} 0 & \text{if } g_1 \neq g_2, \\ |G| & \text{if } g_1 = g_2. \end{cases} \qquad (5')$$ 
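
For a cyclic group these relations are the familiar facts about the discrete Fourier transform. The following sketch assumes the concrete group Z/6 written additively, with characters sending g to exp(2πikg/6); this explicit description of the characters and all the names used are choices made for the illustration. It checks orthonormality, the expansion of a function in characters, and the vanishing of the sum of all characters at a non-identity element.

```python
import cmath

n = 6  # the cyclic group Z/n, written additively as 0, 1, ..., n-1

def character(k):
    """The character of Z/n sending g to exp(2*pi*i*k*g/n)."""
    return lambda g: cmath.exp(2j * cmath.pi * k * g / n)

def scalar_product(f1, f2):
    """(f1, f2) = (1/|G|) * sum over g of f1(g) * conj(f2(g))."""
    return sum(f1(g) * f2(g).conjugate() for g in range(n)) / n

characters = [character(k) for k in range(n)]

# Orthonormality of the characters.
gram = [[scalar_product(a, b) for b in characters] for a in characters]

# Fourier expansion of an arbitrary sample function on the group.
f = lambda g: complex(g * g % n)
coefficients = [scalar_product(f, chi) for chi in characters]
reconstruct = lambda g: sum(c * chi(g) for c, chi in zip(coefficients, characters))

# Sum of all characters at a fixed group element.
dual_sum = lambda g: sum(chi(g) for chi in characters)
print([round(abs(reconstruct(g) - f(g)), 9) for g in range(n)])
```

Up to rounding error the Gram matrix is the identity, the reconstructed function agrees with f at every point, and the sum of all characters is n at the identity and 0 elsewhere.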




One of the important applications of the character theory of finite Abelian 
groups relates to number theory. For G we take the multiplicative group $(Z/(m))^*$ 
of invertible elements of the ring Z/(m), that is, the group of residue classes 
a + mZ, consisting of numbers coprime to m. A character $\chi$ of $(Z/(m))^*$, with its 
definition extended to be 0 on noninvertible elements, can be viewed as a periodic 
function on Z with period m. Such a function is called a Dirichlet character. The 
Dirichlet series 

$$L(s, \chi) = \sum_{n > 0} \frac{\chi(n)}{n^s}$$ 

associated with these are one of the basic instruments of number theory. They 
form, for example, the basis of the proof of Dirichlet’s theorem, that if a and m 
are coprime, the residue class a + mZ contains an infinite number of prime 
numbers. In the course of the proof it becomes necessary to separate off the 
partial sum $\sum \frac{1}{n^s}$ taken over the residue class a + mZ. Here we apply the relations 
(5’), from which it follows that this sum can be expressed as 

$$\frac{1}{\varphi(m)} \sum_{\chi} \overline{\chi(a)}\, L(s, \chi),$$ 

where $\varphi(m)$ is the Euler function, the order of $(Z/(m))^*$, and the sum takes place 
over all characters of the group. 
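
The way characters single out one residue class can be seen in a small case. The sketch below takes m = 5, an assumption made for the illustration: the invertible residues mod 5 form a cyclic group of order 4 generated by 2, so each character is determined by a 4th root of unity assigned to the generator. The names `dirichlet_character`, `discrete_log` and `indicator` are hypothetical.

```python
import cmath

m = 5
generator = 2       # 2 generates the multiplicative group of residues mod 5
phi = m - 1         # Euler function of the prime 5

# Discrete logarithm: residue r = generator**j (mod m)  ->  exponent j.
discrete_log = {pow(generator, j, m): j for j in range(phi)}

def dirichlet_character(k):
    """Character mod m sending the generator to exp(2*pi*i*k/phi); 0 off the units."""
    def chi(n):
        r = n % m
        if r == 0:
            return 0j
        return cmath.exp(2j * cmath.pi * k * discrete_log[r] / phi)
    return chi

characters = [dirichlet_character(k) for k in range(phi)]

def indicator(a, n):
    """(1/phi(m)) * sum over chi of conj(chi(a)) * chi(n)."""
    return sum(chi(a).conjugate() * chi(n) for chi in characters) / phi

print([round(indicator(3, n).real) for n in range(1, 16)])
```

The combination equals 1 when n lies in the residue class a + mZ and 0 otherwise, which is what allows the sum over one residue class to be rewritten in terms of the series L(s, χ).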

Proceeding to non-Abelian finite groups, we start with the problem of the 
number of their irreducible representations. According to §10, Theorem XIII, 
this equals the rank of the centre Z(C[G]) of the group algebra C[G] of G. It is 
easy to see that an element $x = \sum_{g \in G} f(g)g \in C[G]$ belongs to the centre of C[G] 
(that is, x commutes with all $u \in G$) if and only if $f(ugu^{-1}) = f(g)$ for all $u, g \in G$. 
In other words, the function f(g) is constant on conjugacy classes of elements of 
G. Therefore, a basis of the centre is formed by the elements $z_C = \sum_{g \in C} g$, where 
the C are the different conjugacy classes. In particular, we obtain the result: 


Theorem II. The number of irreducible representations of a finite group G 
equals the number of conjugacy classes of elements of G. 
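
By Theorem II, counting irreducible representations reduces to counting conjugacy classes, which is easy to do by machine for small groups. The following sketch, with hypothetical helper names, counts the conjugacy classes of the symmetric groups on 3, 4 and 5 letters, realised as tuples.

```python
from itertools import permutations

def compose(p, q):
    """The permutation i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition a finite group (a list of permutations) into conjugacy classes."""
    remaining, classes = set(group), []
    while remaining:
        x = next(iter(remaining))
        cls = {compose(compose(g, x), inverse(g)) for g in group}
        classes.append(cls)
        remaining -= cls
    return classes

class_counts = {
    n: len(conjugacy_classes(list(permutations(range(n)))))
    for n in (3, 4, 5)
}
print(class_counts)
```

The counts 3, 5 and 7 are the numbers of partitions of 3, 4 and 5, since two permutations are conjugate exactly when they have the same cycle type.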


Can we find in the general case an analogue of the fact that the characters of 
an Abelian group themselves form a group? We have an analogue of the identity 
character, the identity 1-dimensional representation $\varepsilon(g) = 1$ for $g \in G$. One can 
also propose an analogue of the inverse element for a representation $\rho$, the 
so-called contragredient representation $\hat{\rho}(g) = \rho(g^{-1})^*$, where * denotes the 
adjoint operator, acting on the dual space $L^*$ to the space L on which $\rho$ acts. If 
$\rho$ is a unitary representation (that is, all the $\rho(g)$ are unitary with respect to some 
Hermitian metric; we saw in § 10 that such a metric always exists) then the matrix 
of the transformation $\hat{\rho}(g)$ is simply the complex conjugate of that of $\rho(g)$. 

Finally, there also exists an analogue of the product of characters. We 
start with the case that two groups $G_1$ and $G_2$ are given, with two representations 
$\rho_1: G_1 \to \operatorname{Aut} L_1$ and $\rho_2: G_2 \to \operatorname{Aut} L_2$ of them on vector spaces $L_1$ 
and $L_2$. Consider the tensor product $L = L_1 \otimes_C L_2$ of these vector spaces 
(see §5) and the map $\rho$ of the direct product $G_1 \times G_2$ of $G_1$ and $G_2$ defined by 
$\rho(g_1, g_2)(x_1 \otimes x_2) = \rho_1(g_1)(x_1) \otimes \rho_2(g_2)(x_2)$. It is easy to see that this 
defines a map $\rho: G_1 \times G_2 \to \operatorname{Aut}(L_1 \otimes_C L_2)$ which is a representation of $G_1 \times G_2$; 
it is called the tensor product of the representations $\rho_1$ and $\rho_2$, and denoted 
by $\rho_1 \otimes \rho_2$. 

For example, if $G_1$ and $G_2$ act on sets $X_1$, $X_2$, and $L_1$, $L_2$ are certain spaces of 
functions on $X_1$, $X_2$ preserved by these actions, with $\rho_1$, $\rho_2$ the actions of $G_1$, $G_2$ 
in $L_1$, $L_2$, then the representation $\rho_1 \otimes \rho_2$ acts on the space of functions on 
$X_1 \times X_2$ spanned by $f_1(x_1)f_2(x_2)$ with $f_1 \in L_1$ and $f_2 \in L_2$. It is not hard to see 
that all irreducible representations of $G_1 \times G_2$ are of the form $\rho_1 \otimes \rho_2$, where $\rho_1$ 
is an irreducible representation of $G_1$ and $\rho_2$ of $G_2$. All of these arguments carry 
over to representations of any semisimple algebras (and not just group algebras 
C[G]). 

Suppose now that $G_1 = G_2 = G$. Then we can construct a diagonal embedding 
$\varphi: G \to G \times G$ defined by $\varphi(g) = (g, g)$. (Here the construction uses the specific 
group situation in an essential way, and is not meaningful for algebras.) The 
composite $(\rho_1 \otimes \rho_2) \circ \varphi$ defines a representation $G \to \operatorname{Aut}(L_1 \otimes_C L_2)$ called the 
tensor product of the representations $\rho_1$ and $\rho_2$. 

The essential difference from the Abelian case is the fact that for two irreducible 
representations $\rho_1$ and $\rho_2$ of G, the product $\rho_1 \otimes \rho_2$ may turn out to be reducible. 
Thus irreducible representations do not form a group: the product of two of them 
is in general a linear combination of irreducible representations. For example, the 
representations $\rho$ and $\hat{\rho}$ acting on L and $L^*$ define a representation $\rho \otimes \hat{\rho}$ in 
$L \otimes_C L^*$. It is well known 
from linear algebra that $L \otimes_C L^*$ is isomorphic to the space of linear 
transformations End L (the isomorphism takes a vector $a \otimes \varphi \in L \otimes_C L^*$ into the linear 
transformation of rank 1 $x \mapsto \varphi(x)a$). It is easy to see that in End L the 
representation $\rho \otimes \hat{\rho}$ is written as $\alpha \mapsto \rho(g)\alpha\rho(g)^{-1}$ for $\alpha \in$ End L. But transformations 
which are multiples of the identity are invariant under this representation, and 
hence the representation $\rho \otimes \hat{\rho}$ is reducible, having the identity representation 
among its irreducible factors. (It can be shown that for an irreducible 
representation $\rho$ the representation $\hat{\rho}$ is the unique irreducible representation $\sigma$ such that 
the identity representation appears in the decomposition of $\rho \otimes \sigma$ into irreducible 
representations; in this very weak sense $\hat{\rho}$ is still an inverse of $\rho$.) It is easy to 
check that if $\chi_1$ and $\chi_2$ are characters of representations $\rho_1$ and $\rho_2$ then the 
character of $\rho_1 \otimes \rho_2$ is $\chi_1\chi_2$. 

Iterating this construction, for a group G we can consider the representation 
$\rho \otimes \rho \otimes \cdots \otimes \rho$ in the space $L \otimes L \otimes \cdots \otimes L$. This is called the pth tensor power 
of $\rho$, and denoted by $T^p(\rho)$, where p is the number of factors. From this by 
factorisation we obtain the representation $S^p(\rho)$ in the space $S^pL$ and $\Lambda^p(\rho)$ in 
$\Lambda^pL$. 

For each irreducible representation $\rho_i$, we choose a basis in which $\rho_i(g)$ can be 
written as unitary matrixes $(r^i_{jk}(g))$ (such a basis exists by § 10, Example 3). 
We introduce the scalar product 

$$(f_1, f_2) = \frac{1}{|G|} \sum_{g \in G} f_1(g)\overline{f_2(g)}$$ 

on the set of functions on G. In approximately the same way as for relation (5), 
one proves the result: 


Theorem III. The functions $r^i_{jk}(g)$ are mutually orthogonal and the square of the 
modulus of $r^i_{jk}$ equals $1/n_i$, where $n_i$ is the rank of the representation $\rho_i$. 
In particular, the characters form an orthonormal system of functions. 


Example 1. The group $S_3$ has two 1-dimensional representations, the identity
representation $\varepsilon(\sigma)$ and the representation $\eta(\sigma)$ which is $\pm 1$ depending on the
parity of $\sigma$. Realising $S_3$ as the permutations of the set $X = \{x_1, x_2, x_3\}$ we get
a rank 2 representation $\rho_2$ of $S_3$ on the space of functions on X with
$\sum_{x \in X} f(x) = 0$, which is also irreducible. Since $|S_3| = 6 = 1^2 + 1^2 + 2^2$, it follows from
Burnside’s theorem that these are all the irreducible representations of $S_3$.
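The orthonormality of characters and the count $6 = 1^2 + 1^2 + 2^2$ can be checked by brute force. The following is a minimal sketch, not the book's own computation: the names `characters`, `inner` and `multiplicities` are illustrative, and the character of $\rho_2$ is taken to be (number of fixed points) − 1, which assumes the usual splitting of the permutation representation into constants plus sum-zero functions. The last step also decomposes $\rho_2 \otimes \rho_2$ using the fact that the character of a tensor product is the product of the characters.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))      # the six elements of S_3
identity = tuple(range(3))

def sign(p):
    # Parity of a permutation via its inversion count.
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed_points(p):
    return sum(1 for i, v in enumerate(p) if i == v)

characters = {
    "epsilon": lambda p: 1,                 # identity representation
    "eta": sign,                            # parity representation
    "rho_2": lambda p: fixed_points(p) - 1, # sum-zero functions on X
}

def inner(f1, f2):
    # (f1, f2) = (1/|G|) * sum of f1(g) * conj(f2(g)); all values here are real.
    return Fraction(sum(f1(g) * f2(g) for g in G), len(G))

gram = {(a, b): inner(fa, fb)
        for a, fa in characters.items() for b, fb in characters.items()}

# Character of rho_2 (x) rho_2 is the square of the character of rho_2.
product_character = lambda p: characters["rho_2"](p) ** 2
multiplicities = {name: inner(product_character, chi)
                  for name, chi in characters.items()}
```

Here `multiplicities` records how often each irreducible representation occurs in $\rho_2 \otimes \rho_2$, illustrating that the product of two irreducible representations is reducible.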


Example 2. The octahedral group O (§ 13, Example 4). The group O permutes
the pairs of opposite vertexes of the octahedron, which defines a homomorphism
$O \to S_3$. Hence the representations of $S_3$ found in Example 1 give us certain
irreducible representations of O. From the point of view of the geometry of the
octahedron, these have the following meaning. We saw in § 13, Example 4 that
O contains the tetrahedral group T as a subgroup and (O : T) = 2; then $\eta(g) = 1$
for $g \in T$ and $-1$ for $g \notin T$. The representation $\rho_2$ is realised in the space of
functions on the vertexes of the octahedron taking the same value at opposite
vertexes and with the sum of all values equal to 0. Furthermore, the inclusion
$O \to SO(3)$ defines a 3-dimensional tautological representation $\rho_3$ of O. Finally,
the tensor product $\rho_3 \otimes \eta$ (which in the present case reduces just to multiplying
the transformation $\rho_3(g)$ by the number $\eta(g)$) defines another representation $\rho_3'$.
From the point of view of the geometry of the octahedron, it has the following
meaning. We saw in § 13, Theorem V that the group O(3) contains a subgroup
OT isomorphic to the group of the octahedron, but not contained in SO(3). The
composite $O \cong OT \to O(3)$ defines $\rho_3'$. Since $|O| = 24 = 1^2 + 1^2 + 2^2 + 3^2 + 3^2$,
we have found all the irreducible representations of the octahedral group.


B. Representations of Compact Lie Groups 


Representations of compact Lie groups enjoy almost all the properties present
for finite groups. At the root of all the properties of representations of finite
groups is the fact that they are semisimple, and as we have seen in § 10, in its
various forms (for representations over C or over arbitrary fields) this can be
deduced by one and the same idea: considering sums of the form $\sum_{g \in G} F(g)$ for
various quantities F(g) related to elements of the group; that is, the possibility
of summing or averaging over the group. Now this idea has an analogue in the
theory of compact Lie groups. The corresponding expression is called the integral
over the group. It takes any continuous function f(g) on a compact Lie group
G into a number $I(f) \in \mathbb{C}$ called the integral over the group, and satisfies the
following properties:

\[
\begin{aligned}
I(f_1 + f_2) &= I(f_1) + I(f_2),\\
I(\alpha f) &= \alpha I(f) \quad \text{for } \alpha \in \mathbb{C},\\
I(f) &= 1 \quad \text{if } f \equiv 1,\\
I(|f|^2) &> 0 \quad \text{if } f \ne 0,\\
I(f_i) &= I(f) \quad \text{for } i = 1, 2, 3, \text{ where } f_1(g) = f(g^{-1}),\ f_2(g) = f(ug)\\
&\phantom{= I(f) \quad} \text{and } f_3(g) = f(gu) \text{ for } u \in G.
\end{aligned}
\]


The proof of the existence of I is based on constructing an invariant differential
n-form $\omega$ on G, where n = dim G. We then set

\[ I(f) = c \int_G f\,\omega, \]

where c is chosen such that I(f) = 1 for $f \equiv 1$. A form $\omega$ is invariant if
$\tau_u^*\omega = \omega$ for all $u \in G$, where $\tau_u$ is the transformation $g \mapsto gu$ of G. An
invariant form is constructed by the method described at the beginning of § 15: we
need to choose an n-form $\omega_e \in \bigwedge^n T_e^*$ on the tangent space $T_e$ to G at the
identity e, and define the value of $\omega_g$ on $T_g$ as $(\tau_g^*)^{-1}\omega_e$.

The existence of integration on the group I(f) allows us to carry over word-for-word
the argument of § 10, Example 3 to compact groups, and to prove that
a compact subgroup $G \subset GL(n, \mathbb{C})$ leaves invariant some Hermitian positive
definite form $\varphi$. Since this form is equivalent to the standard form $\sum |z_i|^2$, we have
$G \subset C\,U(n)\,C^{-1}$ for some $C \in GL(n, \mathbb{C})$, that is, G is conjugate to a subgroup of
the unitary group. This result on compact subgroups of $GL(n, \mathbb{C})$ was discussed
without proof in §15, Example 8. In a similar way, a compact subgroup of
$GL(n, \mathbb{R})$ is conjugate to a subgroup of O(n). Here is one famous application of
the same ideas.


Example 3. The Helmholtz-Lie Theorem. A flag F in an n-dimensional real
vector space L is a sequence of oriented embedded subspaces
$L_1 \subset L_2 \subset \cdots \subset L_{n-1} \subset L_n = L$ with $\dim L_i = i$. If we introduce a Euclidean
metric in L, then a flag corresponds uniquely to an orthonormal basis $e_1, \dots, e_n$,
with $L_i = \{e_1, \dots, e_i\}$ (as oriented subspace). It follows from this that the manifold
$\mathcal{F}$ of all flags is compact. The group $GL(n, \mathbb{R})$ acts on $\mathcal{F}$:
$g(L_1, L_2, \dots, L_{n-1}) = (g(L_1), g(L_2), \dots, g(L_{n-1}))$, for $g \in GL(n, \mathbb{R})$ and
$(L_1, L_2, \dots, L_{n-1}) \in \mathcal{F}$.

Theorem. Suppose that a subgroup $G \subset GL(n, \mathbb{R})$ acts simply transitively on
$\mathcal{F}$ (that is, for any two flags $F_1$ and $F_2$ there exists a unique transformation $g \in G$
for which $g(F_1) = F_2$). Then L can be given a Euclidean metric so that G becomes
the group of orthogonal transformations.


Indeed, since the stabiliser subgroup of a point of $\mathcal{F}$ under the action of G is
trivial, G can be identified with the orbit of any point, which by the transitivity
of the action is the whole of $\mathcal{F}$. Hence since $\mathcal{F}$ is compact, so is G. From this, as
we have seen, it already follows that there exists a Euclidean metric invariant
under G. The fact that G is the whole of the orthogonal group of this metric
follows easily from the fact that it acts transitively on $\mathcal{F}$.
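The step "there exists a Euclidean metric invariant under G" rests on averaging a positive definite form over the group. The sketch below illustrates the same averaging for a finite group, where the integral I becomes a plain sum; this is an analogue chosen for illustration, not the compact Lie group case itself. The matrix `A` and all function names are hypothetical choices: `A` satisfies $A^2 = -E$, so it generates a cyclic group of order 4 that does not preserve the standard form $x_1^2 + x_2^2$.

```python
from fractions import Fraction

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def transpose(a):
    return tuple(tuple(a[j][i] for j in range(2)) for i in range(2))

def generate_group(gen):
    # Collect the powers of gen until they return to the identity.
    identity = ((1, 0), (0, 1))
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        h = matmul(g, gen)
        if h not in elems:
            elems.add(h)
            frontier.append(h)
    return elems

A = ((1, 2), (-1, -1))   # illustrative generator with A*A = -E
G = generate_group(A)

# Average the standard form over the group: phi = (1/|G|) * sum of g^T g.
phi = tuple(
    tuple(Fraction(sum(matmul(transpose(g), g)[i][j] for g in G), len(G))
          for j in range(2))
    for i in range(2)
)

def transform(form, g):
    # The form x^T S y becomes x^T (g^T S g) y after substituting x -> g x, y -> g y.
    return matmul(matmul(transpose(g), form), g)
```

The averaged form `phi` is symmetric and positive definite and satisfies `transform(phi, g) == phi` for every group element, so in a basis orthonormal for `phi` the group acts by orthogonal matrices.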

The assertion we have proved is a local analogue and a first step in the proof
of the famous Helmholtz-Lie Theorem, which gives an intrinsic characterisation
of Riemannian manifolds of constant curvature. Namely, suppose that X is a
differentiable manifold and G a group of diffeomorphisms which acts simply
transitively on the set of points $x \in X$ and flags in the tangent spaces $T_x$ (that is,
for any two points $x, y \in X$ and flags $F_x$ in $T_x$, $F_y$ in $T_y$, there exists a unique
transformation $g \in G$ such that $g(x) = y$, $g(F_x) = F_y$). Then X can be given a
metric which turns it into one of the spaces of constant curvature, Euclidean,
Lobachevsky, spherical or Riemannian (the quotient of the n-dimensional sphere
by a central reflection), and G into the group of motions of the geometry. The
assertion we have proved allows us to introduce on X a Riemannian metric, and
then apply the technique of Riemannian geometry.

The transitivity of the action of a group of motions on the set of flags is called 
the complete isotropy axiom of a Riemannian manifold. Thus Riemannian mani- 
folds satisfying this axiom are analogues of the regular polyhedrons (see § 13, 
Example 4) and conversely, regular polyhedrons are finite models of Riemannian 
spaces of constant curvature. 

It was apparently in the proof of the Helmholtz-Lie theorem that the role of
the flag manifold $\mathcal{F}$ was first understood. Subsequently it appears repeatedly: in
topology, in the representation theory of Lie groups and in the theory of algebraic
groups; and it always reflects the property we met above: it is the ‘best’ compact
manifold on which the group $GL(n, \mathbb{R})$ acts transitively.

We now return to the representation theory of compact groups, which is also
based on the existence of integration over the group.


Theorem IV. (1) A finite-dimensional representation of a compact Lie group G
is equivalent to a unitary representation and is semisimple.

(2) In the space $L^2(G)$ of square-integrable functions f (for which $I(|f|^2) < \infty$),
we introduce the inner product

\[ (f, g) = I(f\bar{g}). \]

Then the analogue of the orthogonality relations holds word-for-word for the
matrixes of the irreducible finite-dimensional representations (see Theorem III).

(3) Define the regular representation of G in the space $L^2(G)$ by the condition
$T_g(f)(u) = f(ug)$. Then the regular representation decomposes as a direct sum of
a countable number of finite-dimensional irreducible representations, and each
irreducible representation appears in this decomposition the same number of times
as its rank. A finite-dimensional representation is uniquely determined by its traces.

(4) If a group G admits an embedding $\rho: G \to \operatorname{Aut}(V)$ then all its irreducible
representations are contained among those appearing in the decomposition of
representations of the form $T^p(\rho) \otimes T^q(\hat\rho)$.

(5) The irreducible representations of a direct product $G_1 \times G_2$ are of the form
$\rho_1 \otimes \rho_2$ where $\rho_1$ and $\rho_2$ are irreducible representations of $G_1$ and $G_2$.


The first and second assertions are proved word-for-word in the same way as
for finite groups. The idea of the proof of the third can be illustrated if we assume
that G is a closed subgroup of the group of linear transformations Aut(V) of a
finite-dimensional vector space V (in fact such a representation of G is always
possible, and for important examples of groups such as the classical groups, it is
part of the definition). Then we have a tautological representation $\rho: G \to \operatorname{Aut}(V)$
or $G \to GL(n, \mathbb{C})$, where n = dim V. If $\rho(g)$ is the matrix $(r_{jk}(g))$ then the values of
the $2n^2$ functions $x_{jk} = \operatorname{Re} r_{jk}(g)$ and $y_{jk} = \operatorname{Im} r_{jk}(g)$ determine the element g
uniquely. By Weierstrass’ approximation theorem, any continuous function f on
G can be approximated by polynomials in $x_{jk}$ and $y_{jk}$. But one sees easily that
polynomials in $x_{jk}$ and $y_{jk}$ coincide with linear combinations of the matrix
elements of all possible representations $T^p(\rho) \otimes T^q(\hat\rho)$ or of their irreducible
components. Therefore any continuous function on G can be approximated
by linear combinations of functions $r^i_{jk}(g)$ corresponding to irreducible
finite-dimensional representations $\rho_i$, from which it follows easily that any function
$f \in L^2(G)$ can be expanded as a series in this orthogonal system. (3) follows
easily from this. The proof gives even more, the information about irreducible
representations of a group G contained in (4), (5).

Note finally that the same properties (1)—(3) hold for any compact topological 
group. In this case, the ‘integral over the group’ I(f) is defined in a different way, 
by a beautiful construction of set theory. 


Example 4. Compact Abelian Lie Groups. Here all irreducible representations
are 1-dimensional and we have a complete analogue of the character theory of
finite Abelian groups. A compact Abelian group G has a countable number of
characters, or homomorphisms $\chi: G \to \mathbb{C}^*$. They are related by orthogonality
relations analogous to (5), and any function $f \in L^2(G)$ can be expanded as a series
$f = \sum c_\chi \cdot \chi$ in them. Characters form a (discrete) group $\hat{G}$, and we have the same
relation between subgroups of G and $\hat{G}$ as in the case of finite groups. In the
particular case of the circle group $G = \mathbb{R}/\mathbb{Z}$, the group $\hat{G}$ is isomorphic to $\mathbb{Z}$, since
all characters of G are of the form $\chi_n(\varphi) = e^{2\pi i n\varphi}$. The expansion $f = \sum c_n\chi_n$ is
the Fourier series of f. This explains the role of the functions $e^{2\pi i n\varphi}$ (or $\sin 2\pi n\varphi$
and $\cos 2\pi n\varphi$) in the theory of Fourier series, as characters of G.

The theory takes on the most complete appearance if we extend it to the class
of locally compact Abelian groups. The relation between G and its character group
$\hat{G}$ (which is also locally compact) is described by Pontryagin duality.
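For the circle group the orthogonality of characters and the expansion $f = \sum c_n\chi_n$ can be tried out numerically. This is a rough sketch under stated assumptions: the integral over the group is replaced by an average over 64 equally spaced sample points (exact for the low frequencies used here, up to rounding), and the names `chi`, `integral`, `inner` and `fourier_coefficient` are illustrative.

```python
import cmath

def chi(n):
    # Character chi_n(phi) = exp(2*pi*i*n*phi) of the circle group R/Z.
    return lambda phi: cmath.exp(2j * cmath.pi * n * phi)

def integral(f, samples=64):
    # Stand-in for I(f): average of f over equally spaced points of R/Z.
    return sum(f(k / samples) for k in range(samples)) / samples

def inner(f1, f2):
    # (f1, f2) = I(f1 * conj(f2))
    return integral(lambda t: f1(t) * f2(t).conjugate())

def fourier_coefficient(f, n):
    # c_n = (f, chi_n)
    return inner(f, chi(n))

# A trigonometric polynomial with known coefficients c_2 = 3 and c_{-5} = -i.
f = lambda t: 3 * chi(2)(t) - 1j * chi(-5)(t)
```

Taking the inner product of `f` with a character recovers the corresponding coefficient, which is the sense in which the Fourier series is the expansion of f in characters.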


Example 5. Let $\rho: SO(n) \to \operatorname{Aut}(L)$ be the tautological representation of the group
SO(n) as the orthogonal transformations of an n-dimensional Euclidean space
L. The representation $S^2\rho$ can be realised in the space of symmetric bilinear forms
on L (identifying L with $L^*$ using the Euclidean structure of L). Then $u \in SO(n)$
acts on a function $\varphi(a, b)$ (for $a, b \in L$) by taking it to $\varphi(\rho(u)^{-1}a, \rho(u)^{-1}b)$. In an
orthonormal basis, $\varphi$ can be written as a symmetric matrix A, and the
transformation law has the usual form $A \mapsto \rho(u)A\rho(u)^{-1}$ (since $\rho(u)^* = \rho(u)^{-1}$).
Obviously, under this the identity matrix remains invariant. Hence
$S^2\rho = I \oplus S^2_0\rho$, where I is the identity representation and $S^2_0\rho$ is the representation in
the space of matrixes with trace 0. It is easy to see that $S^2_0\rho$ is irreducible.

The following Examples 6-8 play a role in the theory of 4-dimensional 
Riemannian manifolds. 


Example 6. Consider in particular the case n = 4. Then
$SO(4) \cong (SpU(1) \times SpU(1))/\{\pm(1, 1)\}$ (by § 15, Formula (10)). Hence $S^2_0\rho$ can be viewed
as a representation of $SpU(1) \times SpU(1)$ and hence, in view of Theorem IV, (5), is of the form
$\rho_1 \otimes \rho_2$ where $\rho_1$ and $\rho_2$ are representations of SpU(1); let’s find these
representations. The representation $S^2_0\rho$ acts, as in the preceding example, on the
9-dimensional space of symmetric bilinear forms $\varphi(a, b)$ of trace 0, where now we
can assume that $a, b \in \mathbb{H}$ are quaternions. Now $u \in SpU(1) \times SpU(1)$ is of the
form $u = (q_1, q_2)$ where $q_1, q_2 \in \mathbb{H}$ with $|q_1| = |q_2| = 1$ and $\rho(u)a = q_1aq_2^{-1}$ (§ 15,
Formula (10)). Consider the action $\rho_1$ of SpU(1) on the 3-dimensional space $\mathbb{H}^-$
of purely imaginary quaternions given by $\rho_1(q): x \mapsto qxq^{-1}$, for $x \in \mathbb{H}^-$ and
$q \in \mathbb{H}$ with $|q| = 1$. Starting from two elements $x, y \in \mathbb{H}^-$ construct the function
$\varphi_{x,y}(a, b) = \operatorname{Re}(xay\bar{b})$ for $a, b \in \mathbb{H}$. It is easy to see that $\varphi_{x,y}(a, b) = \varphi_{x,y}(b, a)$ and
that the action $x \mapsto q_1xq_1^{-1}$, $y \mapsto q_2yq_2^{-1}$ is equivalent to the transformation of the
function $\varphi_{x,y}(a, b)$ under the representation $S^2\rho$ (we need to use $\operatorname{Re}(\bar\xi) = \operatorname{Re}(\xi)$,
$\operatorname{Re}(\xi\eta) = \operatorname{Re}(\eta\xi)$, $\bar{x} = -x$, $\bar{y} = -y$ and $\bar{q} = q^{-1}$ if $|q| = 1$). Thus we have a
homomorphism $\rho_1 \otimes \rho_1 \to S^2\rho$ given by $x \otimes y \mapsto \varphi_{x,y}$, where $\rho_1$ is the standard
representation of SpU(1) (or even of SO(3)) in $\mathbb{H}^-$. It is easy to see that its kernel
is 0. The image can only be $S^2_0\rho$. Hence $S^2_0\rho = \rho_1 \otimes \rho_1$.
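The two facts marked "easy to see" in Example 6 can be verified on sample quaternions in exact rational arithmetic. The following is a sketch with illustrative names (`qmul`, `conj`, `prod`, `phi_form`, `rho_inverse`, `rho1`) and arbitrarily chosen test quaternions; a quaternion $a + bi + cj + dk$ is stored as the tuple `(a, b, c, d)`, and the unit quaternions `q1`, `q2` are chosen with rational coordinates so that $q^{-1} = \bar{q}$ holds exactly.

```python
from fractions import Fraction as Fr

def qmul(p, q):
    # Hamilton product of quaternions stored as (real, i, j, k).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def prod(*qs):
    out = (1, 0, 0, 0)
    for q in qs:
        out = qmul(out, q)
    return out

def phi_form(x, y, a, b):
    # phi_{x,y}(a, b) = Re(x a y conj(b))
    return prod(x, a, y, conj(b))[0]

def rho_inverse(q1, q2, a):
    # rho(u)^{-1} a = q1^{-1} a q2 for u = (q1, q2) with |q1| = |q2| = 1
    return prod(conj(q1), a, q2)

def rho1(q, x):
    # rho_1(q): x -> q x q^{-1} on purely imaginary quaternions
    return prod(q, x, conj(q))

# Illustrative data: unit quaternions with rational coordinates.
q1 = (Fr(1, 2), Fr(1, 2), Fr(1, 2), Fr(1, 2))
q2 = (Fr(3, 5), 0, Fr(4, 5), 0)
x, y = (0, 1, 2, 3), (0, -2, 0, 5)      # purely imaginary
a, b = (1, 2, -1, 4), (0, 3, 1, -2)     # arbitrary

symmetric = phi_form(x, y, a, b) == phi_form(x, y, b, a)
lhs = phi_form(x, y, rho_inverse(q1, q2, a), rho_inverse(q1, q2, b))
rhs = phi_form(rho1(q1, x), rho1(q2, y), a, b)
equivariant = lhs == rhs
```

A check on one set of sample values is of course not a proof; it only confirms that the formula $\varphi_{x,y}(a, b) = \operatorname{Re}(xay\bar{b})$ behaves as the identities listed in the text predict.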


Example 7. Again for n = 4, consider the representation $\bigwedge^2\rho$ (where $\rho$ is as in
Examples 5 and 6), which we realise in the space of skew-symmetric bilinear forms
on $\mathbb{H}$. For $x \in \mathbb{H}^-$ consider the bilinear form $\omega_x(a, b) = \operatorname{Re}(ax\bar{b})$. Then
$\omega_x(b, a) = -\omega_x(a, b)$ and $x \mapsto \omega_x$ defines a homomorphism of representations
$(1 \otimes \rho_1) \to \bigwedge^2\rho$. Similarly, sending $x \in \mathbb{H}^-$ into the form $\xi_x(a, b) = \operatorname{Re}(\bar{a}xb)$ defines a
homomorphism $(\rho_1 \otimes 1) \to \bigwedge^2\rho$. Adding these, we get a homomorphism
$(\rho_1 \otimes 1) \oplus (1 \otimes \rho_1) \to \bigwedge^2\rho$ which one checks easily is an isomorphism, and defines a
decomposition of $\bigwedge^2\rho$ into irreducible summands.
Example 8. Consider finally the representation $S^2\bigwedge^2\rho$. This is a representation
of SO(4) in tensors with the same symmetry conditions as those satisfied by
the curvature tensor of a 4-dimensional Riemannian manifold. According to
Example 7, $\bigwedge^2\rho = (\rho_1 \otimes 1) \oplus (1 \otimes \rho_1)$ where $\rho_1$ is the tautological
representation of SO(3) in $\mathbb{R}^3$. It is easy to see that for any representations $\xi$ and $\eta$ of any
group, $S^2(\xi \oplus \eta) = S^2\xi \oplus S^2\eta \oplus (\xi \otimes \eta)$. In particular,
$S^2\bigwedge^2\rho = (S^2\rho_1 \otimes 1) \oplus (1 \otimes S^2\rho_1) \oplus (\rho_1 \otimes \rho_1)$. By the result of Examples 5 and 6 we can write

\[ S^2\textstyle\bigwedge^2\rho = (I \otimes 1) \oplus (S^2_0\rho_1 \otimes 1) \oplus (1 \otimes I) \oplus (1 \otimes S^2_0\rho_1) \oplus S^2_0\rho, \tag{6} \]

which gives the decomposition of $S^2\bigwedge^2\rho$ into irreducible representations. This
shows which groups of the components of the curvature tensor can be picked out
in an invariant way, so that they have geometric meaning.

According to (6), we write an element $\xi \in S^2\bigwedge^2\rho$ as

\[ \xi = \alpha_0 + \alpha_1 + \beta_0 + \beta_1 + \gamma \]

with $\alpha_0 \in I \otimes 1$, $\alpha_1 \in S^2_0\rho_1 \otimes 1$, $\beta_0 \in 1 \otimes I$, $\beta_1 \in 1 \otimes S^2_0\rho_1$ and $\gamma \in S^2_0\rho$. Since $I \otimes 1$
and $1 \otimes I$ are 1-dimensional representations, $\alpha_0$ and $\beta_0$ are given by numbers
a, b; the so-called Bianchi identity shows that for the curvature tensor of a
Riemannian manifold we always have a = b. The number 12a = 12b is called the
scalar curvature, the symmetric matrix $\gamma$ (of trace 0) the trace-free Ricci tensor,
and $\alpha_1$ and $\beta_1$ the positive and negative Weyl tensors.
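As a sanity check on decomposition (6), the dimensions on both sides can be compared. This is a worked count added for illustration, using only the ranks stated in Examples 5 to 7 ($\dim\rho = 4$, $\dim\rho_1 = 3$) and the standard formulas $\dim\bigwedge^2 V = \binom{n}{2}$, $\dim S^2V = \binom{n+1}{2}$ for $\dim V = n$:

```latex
\[
\begin{aligned}
\dim \textstyle\bigwedge^2\rho &= \binom{4}{2} = 6,
  &\dim S^2\textstyle\bigwedge^2\rho &= \binom{7}{2} = 21,\\
\dim S^2_0\rho_1 &= \binom{4}{2} - 1 = 5,
  &\dim S^2_0\rho &= \binom{5}{2} - 1 = 9,\\
21 &= 1 + 5 + 1 + 5 + 9.
\end{aligned}
\]
```

The five summands correspond, in order, to $\alpha_0$, $\alpha_1$, $\beta_0$, $\beta_1$ and $\gamma$.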


Example 9. Irreducible representations of SU(2). SU(2) has the tautological
representation $\rho: SU(2) \to \operatorname{Aut} \mathbb{C}^2 = \operatorname{Aut} L$. According to Theorem IV, (4), it
follows from this that all irreducible representations of SU(2) are obtained among
the tensor product of any number of copies of $\rho$ and $\hat\rho$. The representation $\hat\rho$ is
equivalent simply to the complex conjugate of $\rho$, and in the present case, to $\rho$
itself. In fact when dim L = 2 the space $\bigwedge^2L$ is 1-dimensional; choosing a basis
$\omega_0$ of $\bigwedge^2L$ gives a bilinear form $\varphi$ on L, with $x \wedge y = \varphi(x, y)\omega_0$, and thus
establishes an isomorphism between L and the dual space $L^*$. On the other hand,
the Hermitian structure on L (contained in the definition of SU(2)) gives an
isomorphism of L with its Hermitian conjugate $\bar{L}^*$. From these two
isomorphisms it follows that $L^*$ and $\bar{L}^*$ are isomorphic, hence L and $\bar{L}$, and hence the
equivalence of $\rho$ and $\hat\rho$.

Thus all irreducible representations of SU(2) are obtained from decompositions
of the representations $T^p(\rho)$ only. One set of representations is immediately
obvious. The 2 x 2 unitary matrixes (or indeed any 2 x 2 matrixes) can be viewed
as transformation matrixes of two variables x, y. As such they act on the space
of homogeneous polynomials (or forms) of degree n in x and y:

\[ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}: F(x, y) \mapsto F(\alpha x + \gamma y, \beta x + \delta y). \]


This representation of rank n + 1 is denoted by $\rho_j$, where j = n/2, following a
tradition which arises in quantum mechanics (the theory of isotopic spin, compare
§18.E below). Obviously, $\rho_j = S^{2j}\rho$. If we take the homogeneous polynomial
F(x, y) into the inhomogeneous polynomial f(z) = F(z, 1) then $\rho_j$ can be written

\[ f(z) \mapsto (\beta z + \delta)^n f\!\left(\frac{\alpha z + \gamma}{\beta z + \delta}\right). \tag{7} \]

It is not hard to check that the $\rho_j$ are irreducible (this will become completely
obvious in §17.C below). By what we have said above, to prove that the $\rho_j$ for
j = 1/2, 1, 3/2, ... are all the irreducible representations of SU(2), we need only
check that the representations $T^p(\rho)$ decompose as a sum of the representations
$\rho_j$; since $\rho$ itself is $\rho_{1/2}$, by induction it is enough to prove that $\rho_j \otimes \rho_{j'}$ decomposes
as a sum of representations $\rho_k$. We can guess the rule for this decomposition if
we consider the subgroup H of SU(2) consisting of diagonal matrixes. This is the
group

\[ g_\alpha = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha^{-1} \end{pmatrix} \quad \text{with } |\alpha| = 1. \tag{8} \]


This is Abelian, and hence when we restrict to it, $\rho_j$ decomposes into
1-dimensional representations. Indeed, a basis of the invariant 1-dimensional
subspaces (realised in the space of forms) is given by the monomials

\[ x^n,\ x^{n-1}y,\ \dots,\ y^n \tag{9} \]

on which H acts via the characters $\chi_n(g_\alpha) = \alpha^n$, $\chi_{n-2}(g_\alpha) = \alpha^{n-2}$, ...,
$\chi_{-n}(g_\alpha) = \alpha^{-n}$. On decomposing the restriction of $\rho_j \otimes \rho_{j'}$ to H, we get the products in pairs
of the characters occurring in the restriction of $\rho_j$ and $\rho_{j'}$ to H, that is, the character
$\chi_{n+n'}$ once (where n = 2j, n' = 2j'), $\chi_{n+n'-2}$ twice, $\chi_{n+n'-4}$ three times, and so on.
From this one can easily guess that if $\rho_j \otimes \rho_{j'}$ decomposes as a sum of
representations $\rho_k$, then this decomposition can only be of the form

\[ \rho_j \otimes \rho_{j'} = \rho_{j+j'} \oplus \rho_{j+j'-1} \oplus \cdots \oplus \rho_{|j-j'|}. \tag{10} \]
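The counting of characters of H that leads to (10) can be carried out mechanically. In the sketch below (function names are illustrative), a representation restricted to H is recorded as a multiset of exponents of $\alpha$; the index n = 2j is used instead of j so that everything stays in integers, which means a step of 1 in j appears as a step of 2 in n.

```python
from collections import Counter

def weights(n):
    # Exponents of alpha occurring in rho_j restricted to H, with n = 2j:
    # n, n-2, ..., -n, each once (the monomials x^n, x^(n-1) y, ..., y^n).
    return Counter(range(-n, n + 1, 2))

def tensor(w1, w2):
    # Restriction of a tensor product: characters multiply, so exponents add.
    out = Counter()
    for a, m1 in w1.items():
        for b, m2 in w2.items():
            out[a + b] += m1 * m2
    return out

def clebsch_gordan(n1, n2):
    # Right-hand side of (10) in terms of n = 2j:
    # n1 + n2, n1 + n2 - 2, ..., |n1 - n2|.
    out = Counter()
    for n in range(abs(n1 - n2), n1 + n2 + 1, 2):
        out += weights(n)
    return out

# Example: j = 3/2 and j' = 1, that is n = 3 and n' = 2.
example = tensor(weights(3), weights(2))
```

In `example` the top exponent 5 occurs once, 3 twice and 1 three times, matching the pattern described in the text; comparing `tensor` with `clebsch_gordan` checks that both sides of (10) have the same restriction to H.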


To prove relation (10), we can use Theorem IV, (3), that is, the fact that a
representation is uniquely determined by its trace. It is enough to note that a
unitary matrix is always diagonalisable, and hence conjugate to a matrix of the
form (8), so that the character of the representation is determined by specifying
it on such matrixes. In particular, for the representations $\rho_j$ we can find it easily
(using the description of the action of $g_\alpha$ in the basis (9)):

\[ \chi_j(g_\alpha) = \chi_j(\alpha) = \frac{\alpha^{2j+1} - \alpha^{-(2j+1)}}{\alpha - \alpha^{-1}}. \tag{11} \]

Now (10) reduces to the simple formula

\[ \chi_j(\alpha)\chi_{j'}(\alpha) = \chi_{j+j'}(\alpha) + \chi_{j+j'-1}(\alpha) + \cdots + \chi_{|j-j'|}(\alpha), \]

which can easily be checked. Thus we have found the irreducible representations
of SU(2) and have proved (10): this is called the Clebsch-Gordan formula.
Fig. 39


Since SO(3) = SU(2)/(+1, −1), the irreducible representations of SO(3) are
contained among those of SU(2); they are just those on which the matrix −E
acts trivially. Obviously this happens exactly when j is an integer. From (11) we
get a formula for the characters of these representations: for a rotation $g_\varphi$ through
an angle $\varphi$,

\[ \chi_j(g_\varphi) = \frac{\sin\bigl((2j + 1)\varphi/2\bigr)}{\sin(\varphi/2)}. \]
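The passage from (11) to the character formula for SO(3) can be spelled out as follows. This short derivation is added for illustration and assumes the standard fact that the diagonal matrix $g_\alpha$ with $\alpha = e^{i\varphi/2}$ maps to the rotation through the angle $\varphi$ under $SU(2) \to SO(3)$:

```latex
\[
\chi_j(g_\varphi)
  = \frac{e^{i(2j+1)\varphi/2} - e^{-i(2j+1)\varphi/2}}{e^{i\varphi/2} - e^{-i\varphi/2}}
  = \frac{\sin\bigl((2j+1)\varphi/2\bigr)}{\sin(\varphi/2)}
  = 1 + 2\sum_{k=1}^{j} \cos k\varphi
  \qquad (j \text{ an integer}).
\]
```

The last expression is the sum of the $2j + 1$ characters $e^{ik\varphi}$, $k = -j, \dots, j$, of the group of rotations about a fixed axis.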


It is interesting that the method of restricting to an Abelian subgroup we have
used (in the case of SO(3), this was the group of rotations around an axis) has
a quantum-mechanical interpretation. Suppose that we have an electron in a
centrally symmetric field. This symmetry should be reflected in the Hamiltonian,
which must be SO(3)-invariant. Then the state space must be SO(3)-invariant,
and hence must decompose as a direct sum of irreducible representations of
SO(3). An irreducible subspace which occurs in the state space is defined by two
numbers, of which one (the azimuthal quantum number) is j, and determines the
type of the corresponding irreducible representation, and the other (the principal
quantum number) distinguishes the different invariant subspaces corresponding
to equivalent representations. All states occurring in one irreducible subspace
must have the same energy level.

If we switch on a magnetic field having rotational symmetry about an axis,
then each irreducible representation of SO(3) restricts to the subgroup $H \subset SO(3)$
of rotations about the axis of symmetry. The restriction to H of an irreducible
representation of SO(3) decomposes, as we have seen, into 1-dimensional
irreducible representations, and the states corresponding to different invariant
subspaces with respect to the subgroup H already have different energy levels. This
describes the splitting of spectral lines in a magnetic field, the Zeeman effect. For
example, Figure 39, which is taken from the article [F.J. Dyson, 37 (1964)], gives
spectrograms showing that the state space of an atom of the metal niobium
transforms according to the representation $\rho_1$ of SO(3), which breaks up into 3
representations of H after switching on the magnetic field.


C. Representations of the Classical Complex Lie Groups 


In the following, we consider analytic representations of the classical groups, 
that is, we assume that the matrix of the linear transformation p(g) has entries 
which are complex-analytic functions of the entries of the matrix g € G. We use 
the relation between the classical complex and compact Lie groups, as described 
in §15. Each classical compact group is contained in some classical complex
group as a maximal compact subgroup: U(n) in GL(n, C), SU(n) in SL(n, C),
SpU(n) in Sp(n, C) and so on. This connection makes it possible to study
finite-dimensional representations of complex groups, starting from the information
on representations of compact groups obtained in the preceding section. The first
main result is as follows:


Theorem V. Finite-dimensional representations of classical complex Lie groups 
are semisimple. 


We explain the idea of the proof using GL(n) as an example; we make
one simplifying assumption: we suppose that in the representation
$\rho: GL(n) \to \operatorname{Aut}(L)$, the entries $r_{ij}(g)$ of the transformation matrix $\rho(g)$ are rational functions
of the entries of the matrix g (in fact, this is always the case, not just for GL(n),
but for all classical groups). Suppose that L has a subspace M invariant under
all transformations $\rho(g)$ for $g \in GL(n)$. In view of the fact proved above that
representations of the groups U(n) are semisimple, there exists a subspace N
invariant under all $\rho(g)$ with $g \in U(n)$ such that $L = M \oplus N$. In a basis obtained
by adjoining together bases of M and N, the transformations $\rho(g)$ for $g \in GL(n)$
have matrixes of the form

\[ \begin{pmatrix} A(g) & C(g) \\ 0 & B(g) \end{pmatrix}. \]

Here the entries $c_{ij}(g)$ of C(g) are rational functions of the entries of g by the
above assumption, and are equal to 0 if $g \in U(n)$. Everything therefore reduces
to the proof of the following lemma, which is not related to representation theory.


Lemma. A function F(Z) which is a rational function of the entries of a variable
matrix $Z \in GL(n)$ and takes the value 0 for all $Z \in U(n)$ is identically 0.
As is well known (and can be checked at once), any matrix Z with $\det(E + Z) \ne 0$
can be written as

\[ Z = (E - T)(E + T)^{-1}, \]

and Z is unitary if and only if $T^* = -T$. Set F(Z) = G(T). Then our assertion
reduces to proving that a rational function G(T) of the entries of a matrix T
which is 0 on skew-Hermitian matrixes is identically 0. Set T = X + iY. The
condition $T^* = -T$ takes the form $X^t = -X$, $Y^t = Y$, where t denotes
transposition. Our function is a rational function of the elements $x_{ij}$ (for i < j) of a
skew-symmetric matrix X and $y_{ij}$ (for $i \le j$) of a symmetric matrix Y, which are
independent real variables. Since the rational function is 0 for all real values of
the variables, it is identically 0, that is G(T) = 0 for all T of the form X + iY with
$X^t = -X$, $Y^t = Y$, where X and Y are any complex matrixes. But any matrix T
can be represented in this form, setting $X = \frac{1}{2}(T - T^t)$, $Y = \frac{1}{2i}(T + T^t)$.
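The two algebraic facts used in this proof can be tried on a concrete 2 x 2 example. The sketch below is illustrative only: the matrix `T` is an arbitrary skew-Hermitian sample, the helper names are hypothetical, and matrixes are stored as nested tuples of Python complex numbers with comparisons made up to rounding error.

```python
E = ((1, 0), (0, 1))

def add(a, b, sign=1):
    # a + sign * b, entry by entry
    return tuple(tuple(a[i][j] + sign * b[i][j] for j in range(2)) for i in range(2))

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((a[1][1] / det, -a[0][1] / det), (-a[1][0] / det, a[0][0] / det))

def transpose(a):
    return tuple(tuple(a[j][i] for j in range(2)) for i in range(2))

def dagger(a):
    # Hermitian conjugate: transpose and complex-conjugate each entry.
    return tuple(tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def cayley(t):
    # Z = (E - T)(E + T)^{-1}
    return mul(add(E, t, -1), inverse(add(E, t)))

T = ((1j, 2 + 1j), (-2 + 1j, -3j))   # sample matrix with T* = -T
Z = cayley(T)

# Decomposition T = X + iY with X skew-symmetric and Y symmetric.
X = tuple(tuple(v / 2 for v in row) for row in add(T, transpose(T), -1))
Y = tuple(tuple(v / 2j for v in row) for row in add(T, transpose(T)))
```

For this skew-Hermitian `T` the matrix `Z` is unitary, applying `cayley` to `Z` returns `T`, and the parts `X` and `Y` come out real, as the argument requires.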


The proof of Theorem V is just one example of a general method of studying
representations of classical complex Lie groups. This method is an analogue of
the analytic continuation of real functions into the complex domain, and is called
the unitary trick. Restricting a representation of such a group G to its maximal
compact subgroup K, we get a representation of K. Conversely, from a
representation of K we get a representation of G if we let the real parameters on which
a matrix $k \in K$ depends take complex values.

We thus get a 1-to-1 correspondence between representations of G and K. For
example, the irreducible representations of SL(2, C) can be written in exactly the
same formulas (7) with the only difference that the entries of the matrix
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$
now take any complex values satisfying $\alpha\delta - \beta\gamma = 1$. (Incidentally, the fact that
they are irreducible follows at once from this.) Because of the relation between
SL(2, C) and the Lorentz group, this description is important for physicists.

All we have said so far might give the impression that the theory of represen- 
tations of the classical complex Lie groups is completely analogous to that of 
representations of compact groups. In actual fact, this is very far from the case; 
the theory of representations of any noncompact Lie group is related with certain 
completely new phenomena. 

As in the compact case, there exists an invariant differential n-form on a group
G (where n = dim G), and using this we can define the regular representation of
G in $L^2(G)$. The regular representation again ‘breaks up into irreducibles’, but
now these words have a different meaning. The irreducible representations are,
in general, infinite-dimensional, and depend on continuously varying
parameters, so that the situation arising here is of a ‘continuous spectrum’ type. The
regular representation decomposes not as a sum, but as an ‘integral’ of irreducible
representations. For example, the characters of the additive group $\mathbb{R}$ of real
numbers are of the form

\[ \chi_\lambda(x) = e^{2\pi i\lambda x} \quad \text{with } \lambda \in \mathbb{R}, \]

and the ‘decomposition’ of the regular representation reduces to the
representation of functions as Fourier integrals (rather than as Fourier series, as in the
case of the compact group $\mathbb{R}/2\pi\mathbb{Z}$). Both the regular representation and the
irreducible representations into which it decomposes are unitary. It follows from
this that in the majority of cases they cannot be finite-dimensional. For example
if G is simple, then a nonidentical representation must be an embedding, and
cannot be an embedding into some group U(n) since G is noncompact and U(n)
is compact. As a rule, not all irreducible unitary representations occur in the
decomposition of the regular representation. But even those which arise are not
contained in the regular representation as subrepresentations: just as a point of
the continuous spectrum of an operator does not correspond to any eigenvector.
The exceptional cases when irreducible representations are contained in the 
regular representation are very interesting; they are an analogue of the discrete 
part of the spectrum. Of this type, for example are the representations (for n > 0) 
of the group SL(2, R) acting on the space of analytic functions f(z) in the upper 
(or lower) half-plane with the inner product 


(fifo) = |, fle F@yax Ady forz=x + iy, 


by the formulas 


Ty f0@) = (Be + ay 742) where g =| * a 


The very construction suggests that these are related to the theory of auto- 
morphic functions. 


§18. Some Applications of Groups 


A. Galois Theory 


Galois theory studies the ‘symmetries’, that is, the automorphisms of finite 
extensions; see § 6 for the definition of finite extensions and their simplest proper- 
ties. For simplicity, we will assume that the fields under consideration are of 
characteristic 0, although in fact all the main results hold in much greater 
generality, for example, also for finite fields. 

Every finite extension L/K is of the form $K(\alpha)$ where $\alpha$ is a root of an irreducible
polynomial $P(t) \in K[t]$ (under the assumption that the fields have characteristic
0), and the degree of P(t) is equal to [L : K]. Hence Galois theory can be treated
in terms of polynomials (as Galois himself did), although a treatment in these
terms is not invariant, in the sense that different polynomials P(t) can generate
the same extension L/K.

An automorphism of an extension L/K is an automorphism $\sigma$ of the field L
which fixes all the elements of K. All the automorphisms of a given extension
form a group Aut(L/K) under composition. An automorphism $\sigma$ of an extension
$K(\alpha)/K$ is uniquely determined by the element $\sigma(\alpha)$ to which $\alpha$ is taken: any
element of $K(\alpha)$ can be written in the form $\sum_{i=0}^{n-1} a_i\alpha^i$ with $a_i \in K$, so if
$\sigma(\alpha) = \beta$ then $\sigma(\sum a_i\alpha^i) = \sum a_i\beta^i$. On the other hand, if $\sigma(\alpha) = \beta$ and
$P(\alpha) = 0$ with $P(t) \in K[t]$ then also $P(\beta) = 0$. Hence

\[ |\operatorname{Aut}(L/K)| \le \deg P(t) = [L : K]. \tag{1} \]


The bigger the group Aut(L/K), the more symmetric the extension L/K. The
limiting case is when equality holds in (1), that is, when

\[ |\operatorname{Aut}(L/K)| = [L : K]. \]

An extension L/K with this property is called a Galois extension. By what we
said above, for this to happen it is necessary that the irreducible polynomial P(t)
of which $\alpha$ is a root (if $L = K(\alpha)$) factorises over L into linear factors; it can be
proved that this is also sufficient. In §12 we gave an example of an extension
which is not a Galois extension, and is even ‘maximally asymmetric’, in the sense
that Aut(L/K) = {e}. The possibility of applying group theory to the structure
of fields is based on the fact that Galois extensions nevertheless provide sufficiently
complete information.


Theorem I. Every finite extension is contained in a Galois extension. 


It is not difficult to give a recipe for constructing a Galois extension $\bar{L}/K$
containing a given extension L/K: suppose that $L = K(\alpha)$, where $P(\alpha) = 0$ with
$P(t) \in K[t]$; then we must do the following. Over L, write $P(t) = (t - \alpha)P_1(t)$ with
$P_1(t) \in L[t]$, then construct an extension $L_1 = L(\alpha_1)$ with $P_1(\alpha_1) = 0$, and
proceed in the same way until P(t) factorises into linear factors. Among all Galois
extensions $\bar{L}/K$ containing a given extension L/K there exists a minimal one,
contained in all others.

For a Galois extension L/K, the group Aut(L/K) is called the Galois group of
L/K; it is denoted by Gal(L/K). By definition

\[ |\operatorname{Gal}(L/K)| = [L : K]. \]

The Galois group of a finite extension L/K is defined as the Galois group of the
smallest Galois extension $\bar{L}/K$ containing L/K; the Galois group of an irreducible
polynomial $P(t) \in K[t]$ is the Galois group of the extension $L/K = K(\alpha)$ with
$P(\alpha) = 0$.

If L = K(α) with P(α) = 0 then the smallest Galois extension L̄/K containing
L/K is obtained by successively adjoining to K the roots of the polynomial P(t),
as described above. Any automorphism σ ∈ Gal(L̄/K) is determined by which
elements it maps the roots αᵢ of P(t) to. On the other hand, it can only take them
into roots of the same polynomial. Hence σ performs permutations of the roots
of P(t), and the whole group Gal(L̄/K) acts on the set of these roots. For example,
for the ‘asymmetric’ extension L = Q(∛2) considered in §12, α = ∛2, and
P(t) = t³ − 2 = (t − α)(t² + αt + α²); but the polynomial t² + αt + α² does not
have roots in Q(∛2). We set L̄ = L(α₁) where α₁² + αα₁ + α² = 0, and hence


α₁ = ((−1 ± √−3)/2)·α.


It follows from this that L̄ = L(√−3), and any element of L̄ can be written as
ξ + η√−3 with ξ, η ∈ Q(∛2). Obviously, an automorphism σ ∈ Aut(L̄/Q) is
determined by the values of σ(∛2) and σ(√−3); at the same time, (σ(∛2))³ = 2,
so that


σ(∛2) = εᵏ·∛2 for k = 0, 1 or 2,


where ε = (−1 + √−3)/2 is a 3rd root of 1; and


σ(√−3) = ±√−3.


It is easy to verify that any combination of these values for σ(∛2) and σ(√−3)
really defines an automorphism of L̄/Q, so that |Aut(L̄/Q)| = 6, and the extension
L̄/Q is Galois since [L̄: Q] = 6. Its Galois group acts on the roots of the
polynomial t³ − 2, and obviously gives any permutation of them, so that in this
case Gal(L̄/Q) = S₃. We write out explicitly the action of Gal(L̄/Q) on the roots
of t³ − 2 as a table (the roots are ordered as ∛2, ε∛2 and ε²∛2).
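The action on the roots can be recomputed mechanically. The following is a minimal sketch, not the author's table: the labelling of an automorphism by a pair (k, s), meaning σ(∛2) = εᵏ∛2 and σ(√−3) = s·√−3, and the numbering of the roots by j (for εʲ∛2) are conventions assumed here for illustration. Since ε = (−1 + √−3)/2, an automorphism with s = −1 sends ε to ε², so the root εʲ∛2 goes to ε^(sj + k)∛2.

```python
from itertools import product


def galois_action_table():
    """Tabulate the six automorphisms of Q(2^(1/3), sqrt(-3)) over Q.

    Key (k, s): the automorphism with sigma(2^(1/3)) = eps^k * 2^(1/3)
    and sigma(sqrt(-3)) = s * sqrt(-3), for k in {0, 1, 2}, s in {+1, -1}.
    Value: a tuple whose j-th entry is the index of the image of the
    root eps^j * 2^(1/3).
    """
    table = {}
    for k, s in product(range(3), (1, -1)):
        # sigma(eps) = eps^s, hence sigma(eps^j * 2^(1/3)) = eps^(s*j + k) * 2^(1/3).
        table[(k, s)] = tuple((s * j + k) % 3 for j in range(3))
    return table


if __name__ == "__main__":
    for (k, s), perm in sorted(galois_action_table().items()):
        print(f"k={k}, s={s:+d}: roots (0, 1, 2) -> {perm}")
```

The six tuples produced are pairwise distinct, which is one way to see that every permutation of the three roots occurs.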


At the heart of Galois theory is a remarkable relation between the subextensions
K ⊂ L′ ⊂ L of a Galois extension L/K and the subgroups of its Galois
group G = Gal(L/K). For any subgroup H ⊂ Gal(L/K), we write L(H) for the
subfield of L made up of all the elements of L invariant under all automorphisms
in H; and for a subfield L′ with K ⊂ L′ ⊂ L we write G(L′) for the subgroup
Aut(L/L′) of Gal(L/K).


II. Fundamental Theorem of Galois Theory. The maps H ↦ L(H) and L′ ↦
G(L′) are mutually inverse; they define a 1-to-1 correspondence between the
subgroups H ⊂ Gal(L/K) and subfields L′ with K ⊂ L′ ⊂ L. This correspondence
reverses inclusions: H ⊂ H₁ if and only if L(H₁) ⊂ L(H). Moreover, [L(H): K] =
(G: H). For K ⊂ L′ ⊂ L, the extension L′/K is Galois if and only if G(L′) ⊂ G is
a normal subgroup; in this case, Gal(L′/K) = G/G(L′).


The classic illustration of the methods of Galois theory is their application to
the question of solving equations by radicals; the foundation of this is the natural
interpretation of a radical ⁿ√a in Galois theory.

Suppose that the ground field K contains all the nth roots of 1, that is, the
polynomial xⁿ − 1 splits into linear factors, and that xⁿ − a is irreducible, so that
[K(ⁿ√a): K] = n. It follows from the above discussion that in this case K(ⁿ√a)/K
is a Galois extension, and all of its automorphisms σ ∈ Gal(K(ⁿ√a)/K) = G are
determined by the fact that σ(ⁿ√a) = ε·ⁿ√a, where εⁿ = 1. In other words, setting
σ(ⁿ√a)/ⁿ√a = χ(σ), we get a character of G, and this character is faithful (that is,
its kernel is e), so that the group G is cyclic. The field K(ⁿ√a) as a vector space
over K defines a representation of G, which must break up into 1-dimensional
representations. In fact


K(ⁿ√a) = K ⊕ K·(ⁿ√a) ⊕ K·(ⁿ√a)² ⊕ ⋯ ⊕ K·(ⁿ√a)ⁿ⁻¹,


where σ((ⁿ√a)ʳ) = χʳ(σ)·(ⁿ√a)ʳ. Thus the radicals (ⁿ√a)ʳ correspond to a
decomposition of the representation of the cyclic group G on L into 1-dimensional
representations. This picture is reversible: suppose that L/K is an extension with
cyclic Galois group G of order n; then in the same way as above, we should have


L = Kα₁ ⊕ ⋯ ⊕ Kαₙ with σ(αᵣ) = χʳ(σ)·αᵣ,


where χ(σ) is a character which is a generator of the character group of G. It is
not hard to deduce from this that αᵣ = α₁ʳcᵣ with cᵣ ∈ K, α₁ ∈ L and α₁ⁿ = a ∈ K,
so that L = K(ⁿ√a).

Thus if all the nth roots of 1 are contained in K, a radical extension K(ⁿ√a) is
precisely an extension with a cyclic Galois group. From this, using Theorem II
and very simple properties of solvable groups, one can prove the following result.


Theorem III. An extension L/K is contained in an extension field Λ/K that can
be obtained by successively adjoining radicals (that is, such that


Λ = Λ₀ ⊃ Λ₁ ⊃ ⋯ ⊃ Λᵣ = K, where Λᵢ₋₁ = Λᵢ(ⁿ√λᵢ) with λᵢ ∈ Λᵢ)

if and only if its Galois group is solvable.


It was this result that led to the notion of solvable groups, and indeed to the 
term itself. 

Consider, for example, the field of rational functions k(t₁,…,tₙ) = L and the
subfield K of symmetric functions. As is well known, K = k(σ₁,…,σₙ) where the σᵢ
are the elementary symmetric functions, and σᵢ ↦ yᵢ defines an isomorphism of
k(σ₁,…,σₙ) with k(y₁,…,yₙ). Obviously, Gal(L/K) = Sₙ, and consists of all
permutations of the variables t₁, …, tₙ. But the tᵢ are roots of the equation
xⁿ − σ₁xⁿ⁻¹ + ⋯ + (−1)ⁿσₙ = 0. Applying the isomorphism σᵢ ↦ yᵢ, we can say that
the Galois group of the equation


xⁿ − y₁xⁿ⁻¹ + ⋯ + (−1)ⁿyₙ = 0 (2)

over the field k(y₁,…,yₙ), where y₁, …, yₙ are independent variables, is the
symmetric group Sₙ.

The equation (2) is called the generic equation of degree n. 

Putting together the criterion of Theorem III and well-known facts about the
structure of the groups Sₙ (§ 13, Theorem I), we get the following result.


Theorem IV. The generic equation of degree n is solvable in radicals for n = 2,
3, or 4, and not solvable for n ≥ 5.


The structure of the formulas for solving equations of degrees n = 2, 3 and 4
in radicals can also be predicted from properties of the groups Sₙ for n = 2, 3
and 4 (§ 13, Theorem I).
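
The group-theoretic fact behind Theorem IV can be checked by brute force for small n. The sketch below is an illustration only, not the argument of § 13: it computes the derived series of Sₙ (each term generated by the commutators of the previous one) and calls the group solvable when the series reaches the trivial group. The function names are chosen here for the example.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'p after q'; a permutation is a tuple index -> image."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(gens, n):
    """Subgroup generated by gens; in a finite group, closing under products suffices."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in gens:
            gh = compose(g, h)
            if gh not in elems:
                elems.add(gh)
                frontier.append(gh)
    return elems


def derived_subgroup(group, n):
    """Subgroup generated by all commutators a b a^-1 b^-1 with a, b in group."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group
        for b in group
    }
    return generated_subgroup(commutators, n)


def derived_series_orders(n):
    """Orders of the successive derived subgroups of S_n, until they stabilise."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        nxt = derived_subgroup(group, n)
        if len(nxt) == len(group):
            return orders
        group = nxt
        orders.append(len(group))


def is_solvable_symmetric_group(n):
    return derived_series_orders(n)[-1] == 1


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, derived_series_orders(n), is_solvable_symmetric_group(n))
```

For n = 5 the series stabilises at a subgroup of order 60, so it never reaches the trivial group.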


B. The Galois Theory of Linear Differential Equations 
(Picard-Vessiot Theory) 


We consider a differential equation


y⁽ⁿ⁾ + a₁y⁽ⁿ⁻¹⁾ + ⋯ + aₙy = 0, (3)


with coefficients meromorphic functions of one complex variable in some
domain. Write K for the field C(a₁,…,aₙ) and L for the field of all rational
functions of a₁, …, aₙ and of the n linearly independent solutions of (3) and all
of their derivatives.

A differential automorphism of the extension L/K is an automorphism of L 
which leaves fixed the elements of K and commutes with differentiation of 
elements of L. The group of all differential automorphisms of L/K is the differen- 
tial Galois group of L/K or of the equation (3). 

Since a differential automorphism commutes with differentiation and leaves 
fixed the coefficients of (3), it takes solutions of (3) into other solutions. Since 
solutions of (3) form an n-dimensional vector space, the differential Galois group 
of (3) is isomorphic to some subgroup of GL(n, C). It can be proved that this 
subgroup is an algebraic matrix group (see § 15). In this way there arises a version
of Galois theory in which finite extensions are replaced by differential extensions 
of the type considered above, and finite groups by algebraic groups. This version 
also has a complete analogue of the fundamental theorem of Galois theory. It 
turns out that the analogue of solvability by radicals is solvability by quadratures. 
For example, y = ∫ a(x) dx is a solution of the equation y″ − (a′/a)y′ = 0. In this
case the differential automorphisms are of the form y ↦ y + c with c ∈ C, and
hence the Galois group is isomorphic to G_a (the group of elements of K under
addition, see § 15). The function y = exp(∫ a(x) dx) is a solution of y′ − ay = 0.
Differential automorphisms of this are of the form y ↦ cy with c ∈ C*, so that
the differential Galois group is isomorphic to G_m (the group of nonzero elements
of K under multiplication, see § 15).

By analogy with classical Galois theory, we have the following result. 


Theorem V. The solutions of (3) can be expressed in terms of its coefficients by
means of rational operations, taking integrals, exponentials of integrals and
solving algebraic equations if and only if the differential Galois group has a
normal series with quotients G_a, G_m and finite groups.


For example, it is not hard to find the Galois group of the equation y″ + xy =
0; it is just SL(2, C). Since this group has {±E} as its only proper nontrivial
normal subgroup, and the quotient group PSL(2, C) = SL(2, C)/{±E} is simple,
the equation y″ + xy = 0 cannot be solved by quadratures.


C. Classification of Unramified Covers 


Let X be a connected manifold, x₀ ∈ X a marked point, and G = π(X, x₀) the
fundamental group of X; a manifold Y with a marked point y₀ ∈ Y and a
continuous map p: Y → X such that p(y₀) = x₀ is a finite-sheeted cover if every
point x ∈ X has a neighbourhood U whose inverse image p⁻¹(U) breaks up as a
disjoint union of n open subsets Uᵢ such that p: Uᵢ → U is a homeomorphism for
each i. The number n is the same for every point x, and equals the number of
inverse images of x; it is called the degree of the cover.

A map p: Y → X defines a homomorphism p_*: π(Y, y₀) → π(X, x₀) in a natural
way, taking a map φ: I → Y that defines a loop into its composite with p. Write
G(Y) for the image of p_*. We have the following analogue of the fundamental
theorem of Galois theory.


Theorem VI. The map (Y, y₀) ↦ G(Y) defines a 1-to-1 correspondence between
connected unramified covers of finite degree and subgroups of G of finite index.
The degree of a cover (Y, y₀) → (X, x₀) equals the index (G: G(Y)). A chain of
covers (Z, z₀) → (Y, y₀) → (X, x₀) corresponds to an inclusion G(Z) ⊂ G(Y) of
subgroups. If G(Y) is a normal subgroup of G then the quotient group F = G/G(Y)
acts on Y without fixed points, permuting the inverse images of points of X, and
X = F\Y.


The analogy with the fundamental theorem of Galois theory is so strong that 
one feels the desire to try to establish some kind of direct connections. In some 
cases this is in fact possible. Suppose that X is an algebraic variety over the 
complex number field, and p: Y → X an unramified cover. Using the local
homeomorphism p: Uᵢ → U between open sets Uᵢ ⊂ Y and U ⊂ X, which exists by
the definition of unramified cover, we can transfer the complex structure from X
to Y. Thus Y has a uniquely defined structure of complex analytic manifold. It
can be proved that with this structure Y is isomorphic to an algebraic variety.
We arrive at a situation which is analogous to that considered at the end of § 6.
If X and Y are irreducible, the map p: Y → X induces a map p*: C(X) → C(Y)
on the corresponding rational function fields, and p* is a field homomorphism,
hence an inclusion of fields. Thus C(X) ⊂ C(Y), and it can be proved that this is
a finite extension, with [C(Y): C(X)] = (G: G(Y)). We see that G provides us
with a description of certain finite extensions of the field C(X). However, these
are not all extensions of C(X), but only the ‘unramified’ ones. A general finite
extension L/C(X) is also of the form L = C(Y), where Y is an algebraic variety,
and there exists a map p: Y → X inducing the inclusion of fields C(X) ⊂ C(Y);
but in general the map p is ramified in some subvariety S ⊂ X, that is, for x ∈ S
the number of inverse images p⁻¹(x) is smaller than the degree of the extension
C(Y)/C(X). A description of such extensions can be found by similar methods if
we consider the group π(X ∖ S).

In the case that X is a compact complex irreducible algebraic curve, the space 
X is homeomorphic to an oriented surface. If the genus of this surface is g then 
π(X) has 2g generators x₁, …, x₂g with the single defining relation


$$x_1 x_2 x_1^{-1} x_2^{-1}\, x_3 x_4 x_3^{-1} x_4^{-1} \cdots x_{2g-1} x_{2g} x_{2g-1}^{-1} x_{2g}^{-1} = 1 \qquad (4)$$


(see § 14, Example 7). Thus the subgroups of finite index of this explicitly defined 
group describe the unramified extensions Y > X. 

We mention a result which is similar in outward appearance. Let K be a finite
extension of the p-adic number field Q_p (§ 7, Example 7), containing the pth roots
of 1. Suppose that K contains a primitive pˢth root of 1, but not the pˢ⁺¹th roots
of 1, and that pˢ ≠ 2. Set n = [K: Q_p].


Theorem VII. Finite Galois extensions L/K of K for which [L: K] is a power
of p are in 1-to-1 correspondence with normal subgroups of index a power of p of
the group with n + 2 generators σ₁, …, σₙ₊₂ and the single defining relation


$$\sigma_1^{p^s}\, \sigma_1 \sigma_2 \sigma_1^{-1} \sigma_2^{-1} \cdots \sigma_{n+1} \sigma_{n+2} \sigma_{n+1}^{-1} \sigma_{n+2}^{-1} = 1 \qquad (5)$$

(n is necessarily even).


Despite the amazing similarity between relations (5) and (4), the reason behind 
this parallelism is far from clear. 


D. Invariant Theory 


Let G = GL(n, C) be the group of linear transformations of an n-dimensional
vector space L, and T(L) a space of tensors of some definite kind over L. This
set-up defines a representation φ: G → Aut T(L) of G in T(L). Of special interest
are the polynomial functions F defined on T(L) and invariant under the action
of G: these express intrinsic properties of tensors of T(L), independent of a choice
of coordinate system in L (if we interpret elements g ∈ GL(n, C) as passing to a
different coordinate system). It is convenient to weaken this requirement
somewhat: an intrinsic property of a tensor is often expressed by the vanishing of
some polynomial which gets multiplied by a constant under the action of an
element g:


φ(g)F = c(g)F. (6)


It is easy to see that c(g) is some power of the determinant of g, and (6) is
equivalent to the condition that φ(g)F = F for g ∈ SL(n, C). Polynomials with
this property are called invariants of SL(n, C). For example, if T(L) = S²(L) is the
space of quadratic forms then the discriminant of a quadratic form is an invariant.

In what follows we consider the simplest case, when T(L) = Sᵐ(L) is the space
of forms of degree m over L. The ring A = C[Sᵐ(L)] of all polynomials of the
coefficients of forms contains the subring B of invariants. We illustrate an
application of the main facts of group representation theory by giving the proof 
of one of the fundamental results of invariant theory. 


VIII. First Fundamental Theorem. The ring of invariants is finitely generated 
over C. 


This is based on the following simple lemma.

Lemma. If I ⊂ B is an ideal of the ring of invariants then IA ∩ B = I.


The proof is related to the fact that the polynomial ring is graded (compare
§6, Theorem V): A = A₀ ⊕ A₁ ⊕ A₂ ⊕ ⋯; and B is also graded as a subring of
A. Each of the spaces Aᵢ defines a representation of G = SL(n, C). Every element
a ∈ A is contained in some finite-dimensional invariant subspace Ā ⊂ A (for
example, in the sum of the Aᵢ with i ≤ d, where d = deg a). Since representations
of this group are semisimple (§ 17, Theorem V), Ā splits as a direct sum
Ā = Ã ⊕ B̃, where Ã is the sum of all the irreducible representations appearing
in Ā distinct from the trivial representation, and B̃ ⊂ B; then Ã is a
finite-dimensional subspace invariant under G such that Ã ∩ B = 0. Let a = ã + b,
with ã ∈ Ã and b ∈ B. For x ∈ IA, let x = Σ iₖaₖ with iₖ ∈ I and aₖ ∈ A. Setting
aₖ = ãₖ + bₖ with ãₖ ∈ Ãₖ and bₖ ∈ B, where Ãₖ is the subspace corresponding to
Ã for the element aₖ, we get x = Σ iₖbₖ + Σ iₖãₖ. But Σ iₖbₖ ∈ I and iₖÃₖ is an
invariant subspace in which a representation isomorphic to Ãₖ is induced. Hence
Σ iₖãₖ ∈ Σ iₖÃₖ. From the basic properties of semisimple modules (see § 10,
Theorem I) it follows that the module Σ iₖÃₖ contains only simple submodules
isomorphic to one of the simple submodules of the iₖÃₖ. In particular
(Σ iₖÃₖ) ∩ B = 0, and if x ∈ B then Σ iₖãₖ = 0 and x ∈ I. This proves the lemma.

The fundamental theorem is now obvious. Sending I to IA is a map of ideals
of B into ideals of A, and by the lemma, distinct ideals go into distinct ideals. But
then A Noetherian implies that B is Noetherian; and a graded Noetherian ring
is finitely generated over B₀ = C (§ 6, Theorem V).

It was specifically for the proof of this theorem that Hilbert introduced the 
idea of a Noetherian ring, and proved the Noetherian property of the polynomial 
ring (although this may sound absurd, since Emmy Noether, in whose honour 
the term Noetherian was subsequently introduced, was still a baby at the time 
Hilbert published his work on invariant theory). 


E. Group Representations and the Classification of
Elementary Particles


In the last two decades, a great deal of enthusiasm on the part of theoretical 
physicists has gone into attempts to use the representation theory of Lie groups 
to work out a unified point of view on the enigmatic picture of the elementary 
particles, which have been discovered up to the present in large numbers. Of the 
three types of interactions considered in physics, electromagnetic, weak and 
strong, we will only discuss strong interactions, which relate to nuclear forces. 
Particles taking part in strong interactions, the hadrons, are mesons (intermediate 
particles) and baryons (heavy particles). 

We start with the remarks given at the end of our treatment of § 17, Example
9. The three spectral lines of Figure 39 (triplets in physical terminology) arose as
a result of the violation of the original symmetry with respect to SO(3), which
reduces to the subgroup H = SO(2). The restriction of the 3-dimensional
representation ρ₁ of SO(3) to the subgroup H is no longer irreducible, and
decomposes as three 1-dimensional representations, which correspond to the
observed lines. This picture, as a model, will be the basis of all the ideas we
discuss in what follows; if we have a set of r particles with similar properties, we
can attempt to represent the set as a degeneration, related to a lessening of
symmetry. Mathematically, this relates to the fact that the states of all the
particles under consideration form spaces L₁, …, Lᵣ, in which one and the same
group H acts by representations ρ₁, …, ρᵣ. We look for a bigger group G ⊃ H
having an irreducible representation in L₁ ⊕ ⋯ ⊕ Lᵣ in such a way that the
restriction of this representation to the subgroup H is equivalent to ρ₁ ⊕ ⋯ ⊕ ρᵣ.

The first step is to consider the pair consisting of the proton p and neutron n. 
The proton and neutron have the same spin, and very close (but not identical) 
masses: 


mass of proton = 938.2 MeV = 1.6726 × 10⁻²⁴ gm
mass of neutron = 939.8 MeV = 1.6749 × 10⁻²⁴ gm.


They have different charges, but this only manifests itself in considering
electromagnetic interactions, which are ignored in the present theory. In this
context, Heisenberg proposed as early as the 1930s to consider the proton and
neutron as two quantum states of one particle, the nucleon. Correspondingly, they
will be denoted by N⁺ and N⁰, and the nucleon by N. According to general principles
of quantum mechanics, we get as the state space of the nucleon a 2-dimensional
complex space L with a Hermitian metric: N⁺ and N⁰ correspond to a basis of
L. The symmetries of this space form the group U(2). From now on we consider
its subgroup SU(2), which has very similar properties, and omit the physical
arguments which justify this restriction. In a system consisting of many nucleons,
the state space will be of the form L ⊗ L ⊗ ⋯ ⊗ L; the group SU(2) has a
representation on this. The (tautological) representation of SU(2) on L was
denoted (according to the classification given in §17, Example 9) by ρ_{1/2}. The
representation in the state space of the new system will be T^p(ρ_{1/2}). We know by
§ 17, Example 9 that all the irreducible representations of SU(2) are of the form
ρ_j, where j ≥ 0 is an integer or half-integer, and dim ρ_j = 2j + 1. Moreover, the
Clebsch-Gordan formula § 17, (10) allows us to decompose the representation
ρ_j ⊗ ρ_j′ into irreducible representations. Hence we can find the decomposition
into irreducibles of our representation T^p(ρ_{1/2}). This gives a lot of physical
information. The point is that by the quantum-mechanical dictionary of § 1, the
probability of passing from a state given by a vector φ to a state given by ψ is
equal to |(φ, ψ)|² (assuming that |φ| = |ψ| = 1). But the decomposition of a
representation into irreducible representations can always be taken to be
orthogonal, and this means that states which are represented by vectors
transforming according to different irreducible representations cannot pass into
one another: this is the so-called law of forbidden interactions. Furthermore, in
addition to the indexes j of the irreducible representations ρ_j into which
T^p(ρ_{1/2}) splits, we can also write down actual bases of the subspaces in which
these representations are realised, that is, we can transform the matrix
T^p(ρ_{1/2}(g)) into a direct sum of matrixes ρ_j(g). This allows us to find the
probability of passing between different states of the system.
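
The decomposition of the p-fold tensor power of ρ_{1/2} can be computed by repeated use of the Clebsch-Gordan rule in its standard form: ρ_j ⊗ ρ_j′ is the sum of the ρ_i for i = |j − j′|, |j − j′| + 1, …, j + j′. The sketch below assumes that form of the rule (formula (10) of § 17 is not reproduced here) and stores the index as the integer 2j to avoid half-integers; the function names are chosen for this illustration.

```python
from collections import Counter
from fractions import Fraction


def tensor_with(multiplicities, two_j2):
    """Tensor a sum of irreducibles (keyed by the integer 2j) with rho_{j2}.

    Clebsch-Gordan: rho_{j1} (x) rho_{j2} is the sum of rho_j for
    j = |j1 - j2|, |j1 - j2| + 1, ..., j1 + j2.
    """
    result = Counter()
    for two_j1, mult in multiplicities.items():
        for two_j in range(abs(two_j1 - two_j2), two_j1 + two_j2 + 1, 2):
            result[two_j] += mult
    return result


def tensor_power_of_spin_half(p):
    """Decompose the p-th tensor power of rho_{1/2} (p >= 1).

    Returns a dict {j: multiplicity} with j given as a Fraction.
    """
    decomposition = Counter({1: 1})  # rho_{1/2} itself, stored as 2j = 1
    for _ in range(p - 1):
        decomposition = tensor_with(decomposition, 1)
    return {Fraction(two_j, 2): m for two_j, m in sorted(decomposition.items())}


if __name__ == "__main__":
    for p in range(1, 5):
        parts = tensor_power_of_spin_half(p)
        total_dim = sum(int(2 * j + 1) * m for j, m in parts.items())
        print(p, {str(j): m for j, m in parts.items()}, "dimension", total_dim)
```

A useful sanity check is that the dimensions (2j + 1, counted with multiplicity) add up to 2^p, the dimension of the p-fold tensor power of a 2-dimensional space.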

Now it is natural to apply the same train of thought to the study of other
elementary particles. It turns out that these occur in groups of 2, 3 or 4, and
the masses within one group are very close (although ‘lone’ or singlet particles
also occur). Among the baryons we have for example, in addition to the nucleons
already considered, a singlet Λ-hyperon (of mass 1115 MeV), a Ξ-doublet (that
is, two particles Ξ⁰ and Ξ⁻ of masses 1314 and 1321 MeV) and a Σ-triplet (that
is, three particles Σ⁺, Σ⁰, Σ⁻ of masses 1189, 1192 and 1197 MeV). The same
happens with mesons: there are singlets η-meson, φ-meson and ω-meson, the K
and K*-doublets and the doublets of their antiparticles, and π and ρ-triplets. It
is natural to apply the same ideas to these, proposing that particles of one group
are quantum states of the same particle, whose state spaces are 1-dimensional
for singlets, 2-dimensional for doublets, and 3-dimensional for triplets, and
correspond to representations ρ₀, ρ_{1/2} and ρ₁ of SU(2). In the case of several
particles the tensor products of representations again arise, which split into
irreducibles according to the Clebsch-Gordan formula. All of these arguments,
working in the framework of the group SU(2), form the theory of isotopic spin
or isospin: a state which transforms under the irreducible representation ρ_j of
SU(2) is assigned isotopic spin j. This theory has justified itself very well; thus
the existence of the triplet of π-mesons was predicted on the basis of this theory,
and these were subsequently discovered.

The boldest step of all is the following. In order to be consistent, we should
apply the same arguments to all baryons. They form an octet, that is, there are
8 of them: the singlet Λ, the doublets N and Ξ and the triplet Σ. We should
propose that their varied nature arises only from violation of some higher
symmetries. Physicists put things differently: they propose that there exists an
idealised ‘superstrong interaction’ with respect to which all properties of these
particles are identical. For mathematicians, this is the problem of finding some
group G containing a subgroup H isomorphic to SU(2), and an 8-dimensional
irreducible representation ρ of G such that the restriction of ρ to H splits into ρ₀
(corresponding to Λ), ρ_{1/2} (corresponding to N), another copy of ρ_{1/2}
(corresponding to Ξ) and ρ₁ (corresponding to Σ).

Such a group and representation do exist, namely G = SU(3) with its adjoint
representation on the space M₃⁰(C) of all 3 × 3 matrixes of trace 0, where
the matrix g ∈ SU(3) defines the transformation x ↦ g⁻¹xg for x ∈ M₃⁰(C);
dim_C M₃(C) = 9, and hence dim_C M₃⁰(C) = 8. We write out this representation
in matrix form; write a matrix x ∈ M₃⁰(C) in the form


$$x = \begin{pmatrix} A & B \\ C & \alpha \end{pmatrix},$$


where A is a 2 × 2 matrix, B a 2 × 1 matrix, C a 1 × 2 matrix and α a number.
The condition Tr x = 0 means that α = −Tr A. In SU(3), consider the subgroup
H of matrixes of the form $\begin{pmatrix} U & 0 \\ 0 & 1 \end{pmatrix}$. Then obviously U ∈ SU(2), so that H is
isomorphic to SU(2). Since


$$\begin{pmatrix} U & 0 \\ 0 & 1 \end{pmatrix}^{-1} \begin{pmatrix} A & B \\ C & \alpha \end{pmatrix} \begin{pmatrix} U & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} U^{-1}AU & U^{-1}B \\ CU & \alpha \end{pmatrix},$$


the subdivision of the matrix into blocks A, B, C and α gives a splitting of
the adjoint representation into two 2-dimensional representations and a
4-dimensional representation. The 2-dimensional representations obviously
coincide with ρ_{1/2}. The 4-dimensional representation is reducible, since M₂(C)
splits into scalar matrixes plus matrixes of trace 0. As a result of this, the
4-dimensional representation splits into a 1-dimensional representation,
consisting of matrixes of the form
$\begin{pmatrix} \alpha & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & -2\alpha \end{pmatrix}$, and a 3-dimensional one, consisting of matrixes
$\begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}$ with Tr A = 0. This is the required decomposition.
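
The block computation above can be checked numerically for a particular element of H. The following is a small sketch, not part of the original argument: the parametrisation of SU(2) by three angles and all function names are assumptions made for this illustration. It conjugates a trace-zero 3 × 3 matrix x by g = diag(U, 1), using the fact that g⁻¹ is the conjugate transpose of g for unitary g, and compares the four blocks of g⁻¹xg with U⁻¹AU, U⁻¹B, CU and α.

```python
import cmath
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(a):
    """Conjugate transpose; equals the inverse when the matrix is unitary."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]


def su2(t, phi, psi):
    """The matrix [[a, -conj(b)], [b, conj(a)]] with |a|^2 + |b|^2 = 1, an element of SU(2)."""
    a = math.cos(t) * cmath.exp(1j * phi)
    b = math.sin(t) * cmath.exp(1j * psi)
    return [[a, -b.conjugate()], [b, a.conjugate()]]


def embed(u):
    """The element diag(U, 1) of the subgroup H of SU(3)."""
    return [[u[0][0], u[0][1], 0], [u[1][0], u[1][1], 0], [0, 0, 1]]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def blocks_transform_as_claimed(u, x):
    """Check the block formula for g^{-1} x g with g = diag(U, 1)."""
    g = embed(u)
    y = matmul(matmul(dagger(g), x), g)  # g^{-1} x g
    A = [row[:2] for row in x[:2]]
    B = [[x[0][2]], [x[1][2]]]
    C = [x[2][:2]]
    u_inv = dagger(u)
    return (close([row[:2] for row in y[:2]], matmul(matmul(u_inv, A), u))
            and close([[y[0][2]], [y[1][2]]], matmul(u_inv, B))
            and close([y[2][:2]], matmul(C, u))
            and abs(y[2][2] - x[2][2]) < 1e-12)


if __name__ == "__main__":
    # A sample matrix of trace 0 (the entries are arbitrary illustrative values).
    x = [[1 + 2j, 3, 0.5j], [-1j, 2 - 1j, 4], [7, 1j, -(3 + 1j)]]
    print(blocks_transform_as_claimed(su2(0.7, 0.3, 1.1), x))
```

A numerical check of this kind only confirms the identity for the sample values; the general statement is the matrix computation in the text.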


To return from the idealised picture, described by representations of SU(3), to 
the real baryons is a problem of perturbation theory. Writing the perturbed 
Hamiltonian on the basis of heuristic, but natural, considerations leads to a 
situation in which the answer depends on two arbitrary constants. A suitable 
choice of these two constants allows us to get a good approximation for the 
known masses of the four groups of baryons (Λ, N, Ξ, Σ). Moreover, the same
approach has turned out to be applicable to mesons, in which there are two
distinguished octets: the pseudoscalar mesons (η, K, their antiparticles and π)
and vector mesons (φ, K*, their antiparticles and ρ), leading to the same problem
of representation theory. 

A new situation arises in considering the other group of baryons, which are
also classified by isotopic spin. These are the doublet of Ξ*-hyperons, the triplet
of Σ*-hyperons and the quadruplet of Δ-hyperons. According to the ideology
discussed above, the problem is to find a 9-dimensional representation of SU(3)
whose restriction to SU(2) splits as ρ_{1/2} ⊕ ρ₁ ⊕ ρ_{3/2}. There is no such
representation. However, there exists a ‘nearby’ representation of SU(3), namely
S³ρ, the third symmetric power of the tautological representation ρ of SU(3) in
3-space. If ρ acts on the space of linear forms in variables x, y, z, then S³ρ acts
on the 10-dimensional space L of homogeneous cubic polynomials in x, y, z.
Consider the subgroup H ⊂ SU(3) isomorphic to SU(2) which fixes z, and acts on
x and y by the tautological 2-dimensional representation ρ_{1/2} of SU(2). Then we
have the decomposition L = L₃ ⊕ L₂z ⊕ L₁z² ⊕ Cz³, where Lᵢ is the space of
homogeneous polynomials of degree i in x and y (so that dim Lᵢ = i + 1). We get a
decomposition of S³ρ restricted to H:


ρ_{3/2} ⊕ ρ₁ ⊕ ρ_{1/2} ⊕ ρ₀.


This differs from what we wanted by the summand ρ₀. It is natural to propose
that this summand corresponds to yet another particle, which would have to be
included in our family of baryons. From group-theoretical considerations we can
predict certain properties of this particle, for example its mass. A particle of this
kind has indeed been discovered: it is called the Ω⁻-hyperon.
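
The dimension count behind this decomposition is easy to reproduce. The sketch below is an illustration under the conventions of the text: it lists the monomials of degree 3 in x, y, z, groups them by the power of z (which the subgroup H leaves fixed), and reads off the index j of each summand from its dimension via dim ρ_j = 2j + 1. The function names are chosen for this example.

```python
from collections import defaultdict
from fractions import Fraction


def cubic_monomials():
    """Exponent triples (i, j, k) of the monomials x^i y^j z^k of total degree 3."""
    return [(i, j, 3 - i - j) for i in range(4) for j in range(4 - i)]


def isospin_content():
    """Group the cubic monomials by the power of z.

    The monomials with z-power k span L_{3-k} z^k, of dimension (3 - k) + 1.
    A summand of dimension d is assigned the index j with 2j + 1 = d.
    Returns a dict {j: dimension}.
    """
    by_z_power = defaultdict(list)
    for i, j, k in cubic_monomials():
        by_z_power[k].append((i, j, k))
    return {Fraction(len(ms) - 1, 2): len(ms) for _, ms in sorted(by_z_power.items())}


if __name__ == "__main__":
    print(len(cubic_monomials()))  # total dimension of the space of cubics
    for j, dim in isospin_content().items():
        print(f"rho_{j}: dimension {dim}")
```

The four summands have dimensions 4, 3, 2 and 1, which is the count 10 = 9 + 1 that singles out the extra one-dimensional summand.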

Finally, we can attempt to make sense of all these ideas starting from general
properties of representations of SU(3). We know (§ 17, Theorem IV, (4)) that all
representations of this group can be obtained from the irreducible decomposition
of arbitrary tensor products of two representations: the tautological
representation ρ and the contragredient representation ρ* (for SU(3), as opposed
to SU(2), ρ* is not equivalent to ρ).

Hence the question arises: don’t these elementary representations correspond 
to certain ‘even more elementary’ particles? These conjectural particles are called 
quarks and antiquarks; their existence is supported by a series of experiments. 

Many very important questions remain, however, beyond the reach of a theory 
based on SU(3). For this reason one considers also symmetries based on intro- 
ducing other groups, for example SU(6). Similar ideas have been widely devel- 
oped over the last twenty years, finding applications also outside the domain of 
strong interactions. But at this point the author’s scant information on these 
matters breaks off. 


A. Lie Algebras


Natural and important algebraic systems having all the properties of rings
with the exception of the associativity of multiplication appeared very long ago,
although the algebraic nature of these objects did not immediately become
apparent. In §5 we gave a description of vector fields on a manifold as first order
linear differential operators 𝒟(F) = Σ Pᵢ ∂F/∂xᵢ, or derivations of the rings of
functions on manifolds, that is, maps 𝒟: A → A such that


𝒟(a + b) = 𝒟(a) + 𝒟(b),

𝒟(ab) = a𝒟(b) + b𝒟(a) (1)


and


𝒟(a) = 0 for constants a.


The composite 𝒟₁𝒟₂ of two differential operators is of course again a differential
operator, but if 𝒟₁ and 𝒟₂ are first order operators, then 𝒟₁𝒟₂ is a second
order operator, since second derivatives will appear in it (this is especially
clear for operators with constant coefficients: because of the isomorphism
R[∂/∂x₁, …, ∂/∂xₙ] ≅ R[t₁,…,tₙ], the point here is just that the product of two
polynomials of degree 1 is of degree 2). However, there is a very important
expression in 𝒟₁ and 𝒟₂ which is again a first order operator, the so-called
commutator


[𝒟₁, 𝒟₂] = 𝒟₁𝒟₂ − 𝒟₂𝒟₁. (2)


The fact that the commutator is again a first order operator is most easily seen
by interpreting 𝒟₁ and 𝒟₂ as derivations of the ring of functions, and checking
by substitution that if 𝒟₁ and 𝒟₂ satisfy (1) then so does [𝒟₁, 𝒟₂]. In
coordinates, if 𝒟₁ = Σ Pᵢ ∂/∂xᵢ and 𝒟₂ = Σ Qᵢ ∂/∂xᵢ then


[𝒟₁, 𝒟₂] = Σᵢ Rᵢ ∂/∂xᵢ, where Rᵢ = Σₖ (Pₖ ∂Qᵢ/∂xₖ − Qₖ ∂Pᵢ/∂xₖ). (3)


One sees directly from this that [𝒟₁, 𝒟₂] is a first order operator, but the
definition (2) has the advantage that it is intrinsic, that is, does not depend on
the choice of coordinate system x₁, …, xₙ, while this cannot be seen directly from
the expression (3). Via the interpretation as differential operators, the
commutator operation can be transferred to vector fields. Here it is called the
Poisson bracket, and it is also denoted by [θ₁, θ₂].
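
Formula (3) can be verified exactly in the simplest case of one variable with polynomial coefficients, where 𝒟₁ = P d/dx, 𝒟₂ = Q d/dx and (3) reads [𝒟₁, 𝒟₂] = (PQ′ − QP′) d/dx. The following is a minimal sketch under that restriction; polynomials are stored as lists of integer coefficients (index = power of x), and all function names are chosen for this illustration.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def add(p, q, sign=1):
    """Return p + sign * q."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + sign * b for a, b in zip(p, q)])


def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def diff(p):
    """Derivative of a polynomial."""
    return trim([k * c for k, c in enumerate(p)][1:] or [0])


def apply_field(P, f):
    """Apply the first order operator P d/dx to the polynomial f."""
    return mul(P, diff(f))


def commutator_applied(P, Q, f):
    """(D1 D2 - D2 D1) f for D1 = P d/dx and D2 = Q d/dx, as in definition (2)."""
    return add(apply_field(P, apply_field(Q, f)),
               apply_field(Q, apply_field(P, f)), sign=-1)


def bracket_coefficient(P, Q):
    """The coefficient R = P Q' - Q P' given by formula (3) in one variable."""
    return add(mul(P, diff(Q)), mul(Q, diff(P)), sign=-1)


if __name__ == "__main__":
    P, Q, f = [1, 0, 2], [0, 3], [5, -1, 0, 4]  # 1 + 2x^2, 3x, 5 - x + 4x^3
    print(commutator_applied(P, Q, f) == apply_field(bracket_coefficient(P, Q), f))
```

The second derivatives of f cancel in the difference, which is exactly why the commutator is again a first order operator.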

The vector space of vector fields together with the bracket operation [ , ] is
very similar to a ring. Indeed, if we interpret [ , ] as a multiplication, then all
the axioms of a ring (or even of an algebra) will be satisfied, with the single
exception of the associativity of multiplication; instead of which, the bracket
operation satisfies its own specific identities:


[𝒟, 𝒟] = 0

and

[[𝒟₁, 𝒟₂], 𝒟₃] + [[𝒟₂, 𝒟₃], 𝒟₁] + [[𝒟₃, 𝒟₁], 𝒟₂] = 0.


These follow easily from the definition (2). The second of these is called the Jacobi
identity. It is a substitute for associativity, and as we will see, is closely related
to associativity.

A set ℒ having two operations of addition a + b and commutation (or bracket)
[a, b] is called a Lie ring if it satisfies all the axioms of a ring except for the
associativity of multiplication, in place of which it satisfies


[a, a] = 0,

and

[[a, b], c] + [[b, c], a] + [[c, a], b] = 0 (4)

for all a, b, c ∈ ℒ. If ℒ is a vector space over a field K and [γa, b] = [a, γb] =
γ[a, b] for all a, b ∈ ℒ and γ ∈ K then ℒ is called a Lie algebra over K; the
element [a, b] is called the commutator of a and b. It follows from the relation
[a, a] = 0 that [b, c] = −[c, b] for all b, c (you need to set a = b + c).


Example 1. Vector fields on a manifold with the Poisson bracket operation 
form a Lie algebra (over R or C, depending on whether the manifold is real or 
complex analytic). 


Example 2. All the derivations of a ring A form a Lie ring with respect to the
commutation operation (2). If A is an algebra over a field K ⊂ A then derivations
satisfying the condition 𝒟(a) = 0 for a ∈ K form a Lie algebra over K. The
verification of this is the same as for differential operators.


Example 3. Let A be an associative, but not necessarily commutative, ring. For
a, b ∈ A we set [a, b] = ab − ba. With this bracket, A becomes a Lie ring. If A is
an algebra over a field K, then we get a Lie algebra over the same field. The 
verification of this is again the same as for differential operators. 

If $A = M_n(K)$ is the $n \times n$ matrix algebra then the algebra we obtain is the
general linear Lie algebra, denoted by gl(n, K) or gl(n).
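
The two identities of a Lie ring can be checked directly for the commutator of matrices. The following is a minimal sketch in plain Python; the helper names (`matmul`, `bracket`, `jacobi_sum`) and the sample matrices are illustrative assumptions, not taken from the text.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    return sub(matmul(a, b), matmul(b, a))


def jacobi_sum(a, b, c):
    """Left-hand side of the Jacobi identity (4)."""
    return add(add(bracket(bracket(a, b), c), bracket(bracket(b, c), a)),
               bracket(bracket(c, a), b))


# Arbitrary sample elements of gl(2) with integer entries (hypothetical data).
a = [[1, 2], [3, 4]]
b = [[0, 1], [5, -2]]
c = [[2, 0], [7, 1]]
zero = [[0, 0], [0, 0]]

print(bracket(a, b))                 # a nonzero matrix: gl(2) is not Abelian
print(bracket(a, a) == zero)         # [a, a] = 0
print(jacobi_sum(a, b, c) == zero)   # Jacobi identity
```

Integer entries keep the arithmetic exact, so the comparisons with the zero matrix are not affected by rounding.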

Note that the case of first order linear differential operators does not quite fit
in Example 3, but we can take A to be the ring of all linear differential operators,
and choose a subspace $\mathcal{L} \subset A$ of first order operators, which although not a
subring (because it is not closed under multiplication ab), is closed under [a, b] =
ab - ba. Obviously we then get a Lie algebra. The analogous method applied to
the algebra $A = M_n(K)$ gives the following new important examples (the property
of being closed under commutation is easily checked).


Example 4. Consider the subspace $\mathcal{L} \subset M_n(K)$ consisting of all matrices of
trace 0; $\mathcal{L}$ is called the special linear Lie algebra and denoted by sl(n, K) or sl(n).


Example 5. $\mathcal{L} \subset M_n(K)$ consists of all skew-symmetric matrices a, with


$a^* = -a,$    (5)
where * denotes transposition. $\mathcal{L}$ is called the orthogonal Lie algebra, and
denoted by o(n, K) or o(n).
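
The closure under commutation claimed for Examples 4 and 5 can be illustrated numerically. The sketch below, in plain Python with hypothetical helper names and sample matrices, checks that the commutator of two trace 0 matrices has trace 0, and that the commutator of two skew-symmetric matrices is again skew-symmetric, although their ordinary product is not.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def transpose(a):
    return [list(col) for col in zip(*a)]


def neg(a):
    return [[-x for x in row] for row in a]


# Two matrices of trace 0 (sample elements of sl(3); hypothetical data).
a = [[1, 2, 0], [0, -3, 4], [5, 0, 2]]
b = [[0, 1, 1], [2, 4, 0], [0, 3, -4]]

# Two skew-symmetric matrices (sample elements of o(3); hypothetical data).
s = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]
t = [[0, 4, 1], [-4, 0, -5], [-1, 5, 0]]

print(trace(bracket(a, b)))                         # stays in sl(3)
print(transpose(bracket(s, t)) == neg(bracket(s, t)))  # stays in o(3)
print(transpose(matmul(s, t)) == neg(matmul(s, t)))    # the product st does not
```

This mirrors the remark in the text that these subspaces are not subrings but are closed under [a, b] = ab - ba.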


Example 6. Suppose K = C, and that $\mathcal{L} \subset M_n(C)$ is again characterised by (5),
but where * denotes Hermitian transposition. Then $\mathcal{L}$ is a Lie algebra over R,
called the unitary Lie algebra, and denoted by u(n). Imposing the additional
condition Tr a = 0, we get the special unitary Lie algebra, denoted by su(n).


Example 7. Suppose K = H is the algebra of quaternions, and that $\mathcal{L} \subset M_n(H)$
is again characterised by (5), where now * denotes quaternionic Hermitian
transposition. Then the Lie algebra $\mathcal{L}$ over R is called the unitary symplectic Lie
algebra, and denoted by spu(n).


Example 8. Let J be a $2n \times 2n$ nondegenerate skew-symmetric matrix over a
field K, and $\mathcal{L} \subset M_{2n}(K)$ the set of $a \in M_{2n}(K)$ for which


$aJ + Ja^* = 0.$    (6)


$\mathcal{L}$ is called the symplectic Lie algebra, and is denoted by sp(2n, K) or sp(2n).
The origin of the terms introduced in Examples 4-8 will become clear shortly.
We say that a Lie algebra $\mathcal{L}$ is finite-dimensional if it is finite-dimensional as
a vector space over K; the dimension over K is called the dimension of $\mathcal{L}$ and
denoted by $\dim \mathcal{L}$ or $\dim_K \mathcal{L}$.

For example, the Lie algebra of vector fields in a region of 3-space is infinite-dimensional
over R, since a vector field can be written as
$A \frac{\partial}{\partial x} + B \frac{\partial}{\partial y} + C \frac{\partial}{\partial z}$,
where A, B, C are any differentiable functions. The algebra gl(n) and the algebras
of Examples 4-8 are finite-dimensional:

$\dim gl(n) = n^2$, $\dim sl(n) = n^2 - 1$, $\dim o(n) = \frac{n(n-1)}{2}$,
$\dim_R u(n) = n^2$, $\dim_R su(n) = n^2 - 1$,
$\dim_R spu(n) = 2n^2 + n$, $\dim_K sp(2n, K) = 2n^2 + n$.


An isomorphism of Lie rings and algebras is defined exactly as for associative 
rings. For example, it is well known that all $2n \times 2n$ nondegenerate skew-symmetric
matrices are conjugate (over a field K). It follows easily from this that
the algebras defined by condition (6) for different matrices J are isomorphic
(which is why J is not indicated in the notation sp(2n, K)). Here is a less trivial 
example of an isomorphism. 


Example 9. The vectors of Euclidean 3-space under the vector product operation
[ , ] form a Lie algebra $\mathcal{L}$ over R (the Jacobi identity (4) is well known in
this case). Suppose we assign to each vector a the linear map $\varphi_a(x) = [a, x]$. The
scalar triple product formula shows that $\varphi_a$ is skew-symmetric, that is $\varphi_a^* = -\varphi_a$.
On the other hand, for any skew-symmetric linear map $\varphi$ of $R^3$, there exists a
vector $c \in R^3$ with |c| = 1 such that $\varphi(c) = 0$. Then $\varphi$ also induces a skew-symmetric
transformation in the plane orthogonal to c, that is (if $\varphi \neq 0$) a
rotation through 90° and multiplication by a number k. It is easy to check that
then $\varphi = \varphi_a$ where a = kc. Thus $a \mapsto \varphi_a$ is a 1-to-1 linear map of $\mathcal{L}$ to o(3). It
follows at once from the Jacobi identity that this map is an isomorphism of Lie
algebras. Thus $\mathcal{L}$ is isomorphic to o(3).
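
The isomorphism of Example 9 can be made concrete by writing the map $x \mapsto [a, x]$ as a $3 \times 3$ matrix. The following sketch in plain Python uses hypothetical names (`phi`, `apply`) and sample vectors; it checks that the matrix is skew-symmetric and that the vector product goes over into the matrix commutator.

```python
def cross(a, b):
    """Vector product [a, b] in Euclidean 3-space."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def phi(a):
    """Matrix of the linear map x -> [a, x]."""
    return [[0, -a[2], a[1]],
            [a[2], 0, -a[0]],
            [-a[1], a[0], 0]]


def apply(m, x):
    return [sum(m[i][k] * x[k] for k in range(3)) for i in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def bracket(a, b):
    """Commutator ab - ba of 3 x 3 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def neg(m):
    return [[-x for x in row] for row in m]


# Sample vectors (hypothetical data).
a, b, x = [1, 2, 3], [-2, 0, 5], [4, -1, 2]

print(apply(phi(a), x) == cross(a, x))                   # phi(a) represents [a, .]
print(transpose(phi(a)) == neg(phi(a)))                  # phi(a) lies in o(3)
print(bracket(phi(a), phi(b)) == phi(cross(a, b)))       # brackets correspond
```

The last check is the Jacobi identity for the vector product in disguise: $[a, [b, x]] - [b, [a, x]] = [[a, b], x]$.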

By analogy with the case of associative algebras, we define the notions of 
subalgebra, homomorphism and ideal of Lie algebras (because of the relation 
[b,a] = —[a,b], there is no distinction between left, right and 2-sided ideals), 
and of simple algebra and the quotient algebra by an ideal. The analogue of the 
homomorphism theorem holds. We say that a Lie algebra (or ring) $\mathcal{L}$ is Abelian
(or commutative) if [a, b] = 0 for all $a, b \in \mathcal{L}$. As in the associative case, for a
finite-dimensional algebra with basis $e_1, \dots, e_n$ the bracket operation is defined
by structure constants $c_{ijk}$, with


$[e_i, e_j] = \sum_k c_{ijk} e_k.$
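
As an illustration of structure constants, the sketch below computes them for sl(2) in the basis e, f, h of elementary trace 0 matrices. The choice of basis, the names `basis` and `coordinates`, and the storage convention `c[i][j][k]` for $c_{ijk}$ are assumptions made for this example only.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


# Assumed basis of sl(2): index 0 = e, 1 = f, 2 = h.
e = [[0, 1], [0, 0]]
f = [[0, 0], [1, 0]]
h = [[1, 0], [0, -1]]
basis = [e, f, h]


def coordinates(m):
    """Coordinates of a trace 0 matrix m in the basis (e, f, h)."""
    return (m[0][1], m[1][0], m[0][0])


# c[i][j][k] is the structure constant c_ijk in [e_i, e_j] = sum_k c_ijk e_k.
c = [[coordinates(bracket(basis[i], basis[j])) for j in range(3)]
     for i in range(3)]

print(c[0][1])  # coordinates of [e, f]
print(c[2][0])  # coordinates of [h, e]
print(c[2][1])  # coordinates of [h, f]
```

The relation [b, a] = -[a, b] shows up as the antisymmetry of `c` in its first two indices.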


B. Lie Theory 


The subject matter here is the study from the infinitesimal viewpoint of Lie 
groups in a neighbourhood of the identity, that is, differential calculus on the 
level of groups. The analogue of differentiation is to associate with a Lie group 
a certain Lie algebra; one also determines to what extent a Lie group can be 
reconstructed from the corresponding Lie algebra, thus constructing an analogue 
of the integral calculus. 

We start our treatment of this theory (as it arose historically) with the example 
of a Lie group G acting on a manifold X. Such an action is given by a map 


$\varphi: G \times X \to X$    (7)


(see § 12). Introducing coordinates $u_1, \dots, u_n$ in a neighbourhood of the identity
$e \in G$ and $x_1, \dots, x_m$ in a neighbourhood of $x_0 \in X$, we specify the map by
functions


$\varphi_1(u_1, \dots, u_n; x_1, \dots, x_m), \dots, \varphi_m(u_1, \dots, u_n; x_1, \dots, x_m).$


For definiteness we assume in the following that these are real analytic, and 
similarly, we assume that the group law of the Lie group is real analytic. Other 
versions (n times differentiable, or complex analytic functions) are considered in 
exactly the same way. 

If G is the group R of real numbers under addition, then the action (7) defines 
a 1-parameter group of transformations of X. In mechanics, $\varphi(t, x)$ for $t \in R$ and
$x \in X$ can be interpreted as the motion of a point of configuration space X as
time t varies, and ‘infinitesimally small’ motions have been considered for a very
long time. By this, one means the velocity field $\theta$ of the transformation $\varphi(t, x)$ at
time t = 0; in coordinates, this is

$\theta_i = \left. \frac{\partial \varphi_i(t, x_1, \dots, x_m)}{\partial t} \right|_{t=0}$ for $i = 1, \dots, m$.


The corresponding differential operator is defined by the condition


$\mathcal{D}(F)(x) = \left. \frac{\partial}{\partial t} F(\varphi(t, x)) \right|_{t=0}.$


In the case of an arbitrary group G, Hermann Weyl proposes to imagine the
manifold X as being filled with a material admitting motions that correspond to
the action (7) of G. Here also, we can consider velocity fields of the corresponding
motions. Each such field is determined by a vector $\xi$ of the tangent space $T_{e,G}$ to
G at the identity e. The action (7) defines a map of the tangent spaces


$(d\varphi)_{(e,x)}: T_{e,G} \oplus T_{x,X} \to T_{x,X},$


where $T_{x,X}$ is the tangent space to X at x. For any vector $\xi \in T_{e,G}$, the vector
$(d\varphi)_{(e,x)}(\xi \oplus 0) \in T_{x,X}$ defines the required vector field $\theta_\xi$ on X. It is easy to see
that in coordinates this is given by the formulas $\frac{\partial \varphi_i}{\partial \xi}$ for $i = 1, \dots, m$, and is
analytic. The corresponding differential operator on X is of the form
$\mathcal{D}_\xi(F)(x) = \frac{\partial F(\varphi(g, x))}{\partial \xi}$ (differentiating with respect to the argument g). One sees from this
that it is intrinsically defined (independently of the choice of the coordinate
system). The map $\xi \mapsto \theta_\xi$ is obviously linear in $\xi$ and hence defines a finite-dimensional
space $\mathcal{L}$ of vector fields on X. The basic fact is that the finite-dimensional
family $\mathcal{L}$ of vector fields constructed in this way is closed under
Poisson brackets, and therefore defines a certain Lie algebra.

We explain the reason for this in the only case that will appear in what follows:
when $\varphi$ is the left regular action, that is, when X = G and $\varphi(g_1, g_2) = g_1 g_2$ (for
$g_1 \in G$ and $g_2 \in X = G$). The left regular action commutes with the right regular
action: the left action of $g_1 \in G$ is of the form $g \mapsto g_1 g$, and the right action of
$g_2 \in G$ of the form $g \mapsto g g_2^{-1}$, so the fact that these commute just expresses the
associativity of the group law: $(g_1 g) g_2^{-1} = g_1 (g g_2^{-1})$. From this, by an obvious
formal verification, we see easily that for any $\xi \in T_{e,G}$ the vector field $\theta_\xi(x) =
(d\varphi)_{(e,x)}(\xi \oplus 0)$ is also invariant under the right regular action. In other words,
the tangent vector $\theta_\xi(g g_1^{-1})$ is obtained from the tangent vector $\theta_\xi(g)$ by means
of the differential $d(g_1^{-1})$ of the right regular action $g \mapsto g g_1^{-1}$. Vector fields $\theta$ with
this property are said to be right-invariant (see the remark in § 15 following the
definition of a Lie group); such a field is uniquely determined by the vector $\theta(e)$,
and any tangent vector $\eta \in T_{e,G}$ defines a vector field $\theta$ with $\theta(e) = \eta$: the vector
$\theta(g)$ is then obtained from $\eta$ by the right translation taking e to g. Thus the vector
space of right-invariant vector fields on G is isomorphic to the space $T_{e,G}$. The
space of vector fields $\mathcal{L} = \{\theta_\xi(\cdot) = (d\varphi)_{(e,\cdot)}(\xi \oplus 0) \mid \xi \in T_{e,G}\}$ constructed above
consists, as we have just said, of right-invariant fields, and hence is isomorphic to
a subspace of $T_{e,G}$. But the map $\xi \mapsto \theta_\xi$ has no kernel, as one sees easily, so that
$\dim \mathcal{L} = \dim T_{e,G}$, and therefore $\mathcal{L}$ consists of all the right-invariant vector
fields. Finally, that the set of all right-invariant vector fields is closed under
commutation follows from the obvious relation: if f is a transformation of a
manifold X taking a point x to y, and $\theta'$, $\theta''$ are vector fields on X then


$(df)_x [\theta', \theta'']_x = [(df)\theta', (df)\theta'']_y.$


Thus the family $\mathcal{L}$ of vector fields we have constructed is a Lie algebra, called
the Lie algebra of G and denoted by $\mathcal{L}(G)$. We have obtained the result:


Theorem. The Lie algebra $\mathcal{L}(G)$ of a group G consists of all vector fields
$\theta_\xi(\cdot) = (d\varphi)_{(e,\cdot)}(\xi \oplus 0)$ for $\xi \in T_{e,G}$,
where $\varphi: G \times G \to G$ is the group law of G. It also coincides with the set of
differential operators of the form $\mathcal{D}_\xi(F)(g) = \frac{\partial}{\partial \xi} F(\varphi(g, \gamma))$, where $\xi \in T_{e,G}$, and
differentiation is with respect to the second argument $\gamma$. Finally, $\mathcal{L}(G)$ equals the
algebra of right-invariant vector fields on G, or of right-invariant first order
differential operators.


The structure constants of $\mathcal{L}(G)$ can be expressed very explicitly in terms of
the coefficients of the group law $\varphi(x, y)$ of G in a neighbourhood of e. Since
$\xi \in T_{e,G}$ is uniquely determined by the values $\mathcal{D}_\xi(x_i)(e)$ for coordinates $x_1, \dots, x_n$,
we only have to find these values for the commutators $[\mathcal{D}_\xi, \mathcal{D}_\eta]$. From the fact
that $\varphi(x, e) = x$ and $\varphi(e, y) = y$ it follows that the terms of degree 1 and 2 in the
series for $\varphi(x, y)$ are of the form


$\varphi(x, y) = x + y + B(x, y) + \cdots,$    (8)


where B(x, y) is linear both in x and in y. A simple substitution shows that
$\mathcal{D}_\xi \mathcal{D}_\eta (x_i)(e) = B(\xi, \eta)_i$, where $B(\xi, \eta)_i$ is the ith coordinate. Hence


$[\xi, \eta] = B(\xi, \eta) - B(\eta, \xi).$    (9)


(8) and (9) show that the degree 1 terms in the group law $\varphi(x, y)$ are the same for
all Lie groups of the same dimension (they are the same as those of $R^n$). But the
degree 2 terms define the Lie algebra $\mathcal{L}(G)$.
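
Formulas (8) and (9) can be followed through on a small example that is not in the text: the group of affine maps $t \mapsto (1 + x_1)t + x_2$ of the line, with coordinates $x = (x_1, x_2)$ chosen (as an assumption of this sketch) so that the identity is x = 0. Composing the map with coordinates y first and then the map with coordinates x gives the group law, and reading off its bilinear part gives the bracket:

```latex
% Assumed example: G = { t -> (1 + x_1) t + x_2 }, identity at x = (0, 0).
\begin{align*}
\varphi(x, y) &= \bigl(x_1 + y_1 + x_1 y_1,\; x_2 + y_2 + x_1 y_2\bigr), \\
B(x, y) &= \bigl(x_1 y_1,\; x_1 y_2\bigr), \\
[\xi, \eta] &= B(\xi, \eta) - B(\eta, \xi)
             = \bigl(0,\; \xi_1 \eta_2 - \eta_1 \xi_2\bigr).
\end{align*}
```

In this example the series (8) stops at degree 2, and for the basis vectors $e_1 = (1, 0)$, $e_2 = (0, 1)$ one gets $[e_1, e_2] = e_2$, so the resulting 2-dimensional Lie algebra is not Abelian.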

The invariant nature of the definition of the Lie algebra $\mathcal{L}(G)$ makes a number
of its natural properties almost obvious. If $f: G \to H$ is a homomorphism of Lie
groups, then df defines a homomorphism $\mathcal{L}(G) \to \mathcal{L}(H)$, whose kernel is the Lie
algebra of the kernel of f. If H is a closed subgroup of a Lie group G then $\mathcal{L}(H)$
is a subalgebra of $\mathcal{L}(G)$, and if H is a normal subgroup then $\mathcal{L}(H)$ is an ideal of
$\mathcal{L}(G)$ and $\mathcal{L}(G/H) = \mathcal{L}(G)/\mathcal{L}(H)$. If $\varphi: G \times X \to X$ is an action of a Lie group
on a manifold X then the family $\mathcal{L} = \{\theta_\xi(\cdot) = (d\varphi)_{(e,\cdot)}(\xi \oplus 0) \mid \xi \in T_{e,G}\}$ of vector
fields on X defined above is a Lie algebra, and is a homomorphic image of the
Lie algebra of G: $\mathcal{L} = \mathcal{L}(G)/\mathcal{L}(N)$ where N is the kernel of the action.

Formula (9) shows that the Lie algebra of an Abelian group is commutative. 

Example 10. Let G = GL(n). In a neighbourhood of the identity matrix we can
write A = E + X, and if B = E + Y then AB = E + X + Y + XY. Thus in (8),
we have B(X, Y) = XY, and (9) shows that the commutator of elements X and
Y in the Lie algebra $\mathcal{L}(GL(n))$ is of the form XY - YX, that is, $\mathcal{L}(GL(n)) = gl(n)$.
If X is the matrix $E + (x_{ij})$, where $x_{ij}$ are coordinate functions, then a vector
$\xi \in T_{e,G}$ goes to the matrix $\frac{\partial X}{\partial \xi} = \left( \frac{\partial x_{ij}}{\partial \xi} \right)$.


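Example 11. Now let G = SL(n). Then $G \subset GL(n)$ is the subgroup defined
by det(E + X) = 1. A tangent vector $\xi$ to GL(n) is tangent to SL(n)
if $\frac{\partial}{\partial \xi}(\det(E + X)) = 0$. But, as is well known,

$\frac{\partial}{\partial \xi}(\det(E + X)) = \mathrm{Tr}\left( \frac{\partial X}{\partial \xi} \right).$

Hence for $\xi$ tangent to SL(n) we have $\mathrm{Tr}\left( \frac{\partial X}{\partial \xi} \right) = 0$, and therefore $\mathcal{L}(SL(n)) = sl(n)$.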
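
The ‘well known’ formula used in Example 11 says that the derivative of det(E + tX) at t = 0 is Tr X. A minimal sketch in plain Python, with exact rational arithmetic and hypothetical helper names, compares a central difference quotient with the trace; for a $3 \times 3$ matrix the two differ exactly by $h^2 \det X$, which the sketch prints alongside.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def identity_plus(t, x):
    """The matrix E + t*x."""
    n = len(x)
    return [[(1 if i == j else 0) + t * x[i][j] for j in range(n)]
            for i in range(n)]


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


def derivative_of_det(x, h=Fraction(1, 10**6)):
    """Central difference quotient of t -> det(E + t*x) at t = 0."""
    return (det(identity_plus(h, x)) - det(identity_plus(-h, x))) / (2 * h)


# Sample tangent directions (hypothetical data): x has trace 0, y does not.
x = [[2, -1, 0], [3, 1, 4], [0, 5, -3]]
y = [[1, 2, 0], [0, 3, 1], [4, 0, 5]]

h = Fraction(1, 10**6)
print(trace(x), float(derivative_of_det(x, h)))
print(trace(y), float(derivative_of_det(y, h)))
print(derivative_of_det(y, h) - trace(y) == h * h * det(y))
```

Only the direction x, which has trace 0, is tangent to SL(3) at the identity; the direction y changes the determinant to first order.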


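Example 12. Similarly, if G = O(n) and $E + X \in G$ then $(E + X)(E + X^*) = E$.
Hence

$\left[ \frac{\partial}{\partial \xi}(E + X) \cdot (E + X^*) + (E + X) \cdot \frac{\partial}{\partial \xi}(E + X^*) \right]_{(X=0)} = 0$

or $\left( \frac{\partial X}{\partial \xi} \right) + \left( \frac{\partial X}{\partial \xi} \right)^* = 0$. Thus $\mathcal{L}(O(n)) = o(n)$.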


Example 13. As we indicated at the beginning of § 15, the group SO(3) is the
configuration space for a rigid body moving with a fixed point: the motion of
such a body gives a curve $g(t) \in SO(3)$. A tangent vector $\frac{dg}{dt}$ belongs to the tangent
space $T_{g(t)}$ to SO(3) at the point g(t). We can transform it by right translation $g^{-1}$
to the vector $\gamma(t) = \frac{dg}{dt} g^{-1} \in T_e$, that is, to an element of the Lie algebra o(3). From
the fact that g(t) is orthogonal, that is $g(t)g(t)^* = e$, it follows that
$\frac{dg}{dt} g^* + g \left( \frac{dg}{dt} \right)^* = 0$, that is $\gamma(t)^* = -\gamma(t)$, according to Example 12. If some point
of the body moves according to the law $x(t) = g(t)(x_0)$ then obviously

$\frac{dx}{dt} = \frac{dg(t)}{dt}(x_0) = \gamma(t)g(t)(x_0) = \gamma(t)(x(t)).$

By Example 9, corresponding to the transformation $\gamma(t)$ there is a vector $\omega(t)$ such
that $\gamma$ reduces to vector multiplication by $\omega(t)$. Hence

$\frac{dx(t)}{dt} = [\omega(t), x(t)].$

This equation shows that at each instant t the
velocities of points of the body are the same as under the rotation with constant
angular velocity $\omega(t)$; the vector $\omega(t)$ is called the instantaneous angular velocity.

We could transform the vector $\frac{dg}{dt}$ by a left translation into $\tilde{\gamma}(t) = g^{-1} \frac{dg}{dt} \in T_e$. It
is easy to see that $\tilde{\gamma} = g^{-1} \gamma g$, and the corresponding vector $\tilde{\omega} = g^{-1} \omega$ is the
instantaneous angular velocity in a coordinate system rigidly attached to the
body.

Exactly the same argument as in Examples 11 and 12 allows us to determine 
the Lie algebras of the other Lie groups we know: 


$\mathcal{L}(U(n)) = u(n)$, $\mathcal{L}(SU(n)) = su(n)$, $\mathcal{L}(Sp(n)) = sp(n)$,
and $\mathcal{L}(SpU(n)) = spu(n)$.


We recall again that all the preceding arguments are also applicable to complex 
Lie groups: the corresponding Lie algebras $\mathcal{L}(G)$ are Lie algebras over C, as one
sees easily. In particular,


$\mathcal{L}(GL(n, C)) = gl(n, C)$, $\mathcal{L}(SL(n, C)) = sl(n, C)$,
$\mathcal{L}(O(n, C)) = o(n, C)$ and $\mathcal{L}(Sp(2n, C)) = sp(2n, C)$.


We proceed to the second part of Lie theory, to the question of the extent to
which a Lie group G can be reconstructed from its Lie algebra $\mathcal{L}(G)$. Here there
are two possible ways of stating the problem. Firstly, we could study the group
law $\varphi$ only in some neighbourhood of the identity of the group. If we introduce
coordinates $x_1, \dots, x_n$ in this neighbourhood, then the group law is given by n
power series


$\varphi(x, y) = (\varphi_1(x_1, \dots, x_n, y_1, \dots, y_n), \dots, \varphi_n(x_1, \dots, x_n, y_1, \dots, y_n)).$


These must satisfy the associativity relation $\varphi(x, \varphi(y, z)) = \varphi(\varphi(x, y), z)$ and the
existence of an identity $\varphi(x, 0) = \varphi(0, x) = x$ (the existence of an inverse, that is,
of a power series $\psi(x)$ satisfying $\varphi(x, \psi(x)) = \varphi(\psi(x), x) = 0$ follows easily from
this by the implicit function theorem). Geometrically, this formulation of the
problem corresponds to the study of local Lie groups, that is, analytic group laws
defined in some neighbourhood V of 0 in $R^n$ (the product is an element of $R^n$,
but possibly not of the same neighbourhood), and satisfying the associativity
axiom and the existence of an identity, which is 0. We say that two local Lie
groups defined in neighbourhoods $V_1$ and $V_2$ are isomorphic if there exist
neighbourhoods $V_1' \subset V_1$ and $V_2' \subset V_2$ of 0 and a diffeomorphism $f: V_1' \to V_2'$
taking the first group law into the second. A homomorphism of local Lie groups
is defined similarly. Under this formulation of the question the answer is very
simple.


Lie’s Theorem. Every Lie algebra $\mathcal{L}$ is the Lie algebra of some local Lie group.
A local Lie group is determined up to isomorphism by its Lie algebra. Every Lie
algebra homomorphism $\varphi: \mathcal{L}(G_1) \to \mathcal{L}(G_2)$ between the Lie algebras of two local
Lie groups is of the form $\varphi = (df)_e$ where $f: G_1 \to G_2$ is a homomorphism of local
Lie groups, uniquely determined by this condition.

The most elementary and striking form of this theorem is obtained if we use
formula (9) for the commutation in Lie algebras. The theorem shows that already
the degree 2 terms B(x, y) in the group law determine the group law uniquely
up to isomorphism (that is, up to analytic coordinate transformations). If we
consider $\varphi(x, y)$ as a formal power series, then the theorem takes on a purely
algebraic character, without any analysis involved. It holds for ‘formal group 
laws’ over an arbitrary field of characteristic 0. We will return later to this 
algebraic aspect of Lie theory. 

When passing to global Lie groups, that is, to the by now usual definition 
given in §15, things become a little more complicated. Indeed, already the 
additive group R and the circle group R/Z are not isomorphic, although they 
both have the same Lie algebra, the Abelian 1-dimensional algebra. However, 
the ideal situation is reestablished if we restrict attention to connected and simply 
connected groups (compare § 14, Example 7). 

A theorem proved by Cartan asserts that the above statement of Lie’s theorem 
continues to hold word-for-word if we replace the term ‘local Lie group’ by 
‘connected and simply connected Lie group’. (In the assertion on homomorphisms
$\mathcal{L}(G_1) \to \mathcal{L}(G_2)$, it is enough for $G_1$ to be simply connected.)

Lie theory can also be applied to the study of connected but non-simply
connected Lie groups, since the universal cover $\tilde{G}$ of a group G can be made into
a group (in a unique way) such that $G = \tilde{G}/N$, where N is a discrete normal
subgroup contained in the centre of $\tilde{G}$. This gives a construction of all connected
Lie groups having the same Lie algebra. An example of a representation in the
form $G = \tilde{G}/N$ is G = SO(n), $\tilde{G}$ = Spin(n), N = {E, -E}; or G = PSL(n, C), $\tilde{G}$ =
SL(n, C) and $N = \{\varepsilon E \mid \varepsilon^n = 1\}$.


C. Applications of Lie Algebras 


Most of the applications are based on Lie theory, which reduces many ques- 
tions of the theory of Lie groups to similar questions on Lie algebras, which are 
as a rule simpler. Thus the most direct method of deducing the classification of 
simple Lie groups discussed in §16 consists of the classification of simple Lie 
algebras and then application of Lie-Cartan theory. For example, it is proved
that over the complex number field, there exist the following simple finite-dimensional
Lie algebras: sl(n, C), o(n, C), sp(2n, C) and a further 5 exceptional
algebras of dimensions 78, 133, 248, 14 and 52, denoted respectively by $E_6$, $E_7$,
$E_8$, $G_2$ and $F_4$. By Lie-Cartan theory this gives the classification of complex
connected simple Lie groups. Each Lie algebra corresponds to one simply
connected group G, for example sl(n, C) corresponds to SL(n, C); and a quotient
group of the form G/N, where N is a discrete normal subgroup contained in the 
centre of G, has the same Lie algebra. Since for each of these groups the centre 
Z is itself finite, we obtain together with each simply connected group a finite 
number of quotient groups, as we mentioned in § 16. 

In exactly the same way, the theory of simple real Lie groups reduces to the
theory of simple Lie algebras over R. Their study is carried out by methods
analogous to those discussed in § 11, where we studied simple algebras and
division algebras over non-algebraically closed fields. More precisely, exactly as
we did in § 11 for associative algebras, one can also define the operation of
extension of the ground field $\mathcal{L}_{K'} = \mathcal{L} \otimes_K K'$ in the case that $\mathcal{L}$ is a Lie algebra
over a field K, and K' an extension of K. One proves that if $\mathcal{L}$ is a simple algebra
over R, then $\mathcal{L}_C$ is either a simple algebra over C, or a direct sum of two
isomorphic simple algebras. Thus the problem of studying simple algebras over
R reduces to a similar problem over C. It is precisely in this way that we can give
substance to the notion of ‘real analogues of a complex Lie group G’ as discussed
in § 16.

Finally, we indicate the connection of Lie algebras with mechanics. 


Example 14. Consider the motion under inertia of a rigid body with a fixed
point. Since no external forces act on the body, the law of conservation of angular
momentum gives


$\frac{dJ}{dt} = 0.$    (10)

Suppose that the motion is described by a curve $g(t) \in SO(3)$, as in Example 13.
Introducing the angular momentum $\tilde{J} = g^{-1}J$ in a system of coordinates rigidly
attached to the body, we rewrite (10) after obvious transformations in the form

$\frac{d\tilde{J}}{dt} + [\tilde{\omega}, \tilde{J}] = 0.$    (11)

These equations (there are 3 of them corresponding to the 3 coordinates of the
vector $\tilde{J}$) are called the Euler equations. They can be viewed as equations for $\tilde{J}$,
since the relation between $\tilde{\omega}$ and $\tilde{J}$ is realised by the inertia tensor I,


$\tilde{J} = I(\tilde{\omega}),$    (12)


where I is a symmetric linear transformation independent of t. The transformation
I determines the kinetic energy by the formula


$T = \frac{1}{2}(\tilde{J}, \tilde{\omega}) = \frac{1}{2}(I(\tilde{\omega}), \tilde{\omega}),$    (13)


and is therefore positive definite.
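
The Euler equations (11) can be integrated numerically to watch the two quantities that they conserve: the kinetic energy (13) and the squared length of the body angular momentum. The sketch below is an illustration only. It assumes body axes in which the inertia tensor I is diagonal, uses made-up principal moments and initial data, and applies a classical fourth order Runge-Kutta step; none of these choices comes from the text.

```python
def cross(a, b):
    """Vector product [a, b]."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def euler_rhs(J, inertia):
    """dJ/dt = -[omega, J], where omega = I^{-1}(J) and I = diag(inertia)."""
    omega = [J[i] / inertia[i] for i in range(3)]
    return [-c for c in cross(omega, J)]


def rk4_step(J, inertia, dt):
    """One classical Runge-Kutta step for the Euler equations."""
    k1 = euler_rhs(J, inertia)
    k2 = euler_rhs([J[i] + 0.5 * dt * k1[i] for i in range(3)], inertia)
    k3 = euler_rhs([J[i] + 0.5 * dt * k2[i] for i in range(3)], inertia)
    k4 = euler_rhs([J[i] + dt * k3[i] for i in range(3)], inertia)
    return [J[i] + dt / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
            for i in range(3)]


def energy(J, inertia):
    """Kinetic energy T = (J, omega)/2 for diagonal I."""
    return 0.5 * sum(J[i] * J[i] / inertia[i] for i in range(3))


def norm_squared(J):
    return sum(c * c for c in J)


inertia = [1.0, 2.0, 3.0]   # assumed principal moments of inertia
J0 = [1.0, 0.1, 0.2]        # assumed initial angular momentum in the body
J = J0
for _ in range(2000):
    J = rk4_step(J, inertia, 0.001)

print(J)
print(energy(J, inertia) - energy(J0, inertia))
print(norm_squared(J) - norm_squared(J0))
```

Both differences stay at the level of the integration error, while the vector J itself moves; this reflects the fact that $[\tilde{\omega}, \tilde{J}]$ is orthogonal to both $\tilde{\omega}$ and $\tilde{J}$.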

In Formula (11), [ , ] denotes the vector product, but according to Example
9, it can also be interpreted as the commutator bracket of the algebra o(3). In
this form the equations (11) can be generalised to a very wide class of Lie groups
and Lie algebras. Under this, the energy T is interpreted as a Riemannian metric
on a Lie group G, invariant under left translations (since the transformation I in
(13) is constant); it is defined by a symmetric transformation $\mathcal{L}(G) \to \mathcal{L}(G)^*$.
Since (by analogy with the case of a rigid body and the group SO(3)) we take for
$\tilde{\omega}$ an element of the Lie algebra $\mathcal{L}(G)$, we have $\tilde{J} \in \mathcal{L}(G)^*$, and for it we can write
down the equation (11). It turns out that in a number of cases such equations have
interesting physical meaning. For example, the case of the group G of all motions
of Euclidean 3-space (that is, $G = O(3) \cdot T$ where T is the group of translations)
corresponds to the inertial motion of a body in an ideal fluid. The case of the
group SO(4) also has physical meaning. But the most interesting is the case of
the ‘infinite-dimensional Lie group’ of all diffeomorphisms of a manifold, which
has the Lie algebra of all vector fields as its Lie algebra. This case is related to
phenomena such as the motion of an ideal fluid. However, it does not fit into the
standard theory of Lie algebras and groups, and the theory would seem to be at
present on a heuristic level.


D. Other Nonassociative Algebras 


The theory of Lie algebras shows in a very convincing way that deep and 
important results of the theory of rings are not necessarily connected with the 
requirement of associativity. Lie algebras are perhaps the most vivid examples 
of nonassociative rings of importance for the whole of mathematics. But there 
are others. 


Example 15. As discussed in § 8, Example 5, quaternions can be written in the
form $z_1 + z_2 j$, where $z_1$ and $z_2$ are complex numbers. In this form, the multiplication
of quaternions is very simple to describe: if we assume that all the ring axioms
are satisfied (and that operations on complex numbers are the same as in C), then
we only have to specify that $j^2 = -1$ and $jz = \bar{z}j$. We can attempt to go further
in the same direction, defining an 8-dimensional algebra consisting of elements
$q_1 + q_2 l$ where $q_1$, $q_2$ are quaternions, and l a new element. It turns out that in
this way we arrive at an interesting nonassociative division algebra. If we postulate
all the ring axioms, except for associativity of multiplication, then we only
need to specify the products $q_1 \cdot q_2$, $q_1 \cdot (q_2 l)$, $(q_1 l) \cdot q_2$ and $(q_1 l) \cdot (q_2 l)$. We will
assume that quaternions multiply as elements of H, and set in addition


$q_1 (q_2 l) = (q_2 q_1) l, \quad (q_1 l) q_2 = (q_1 \bar{q}_2) l$
and
$(q_1 l) \cdot (q_2 l) = -\bar{q}_2 q_1,$


where $\bar{q}$ denotes the quaternion conjugate of q.
In other words multiplication is defined by the formula


$(p_1 + p_2 l)(q_1 + q_2 l) = p_1 q_1 - \bar{q}_2 p_2 + (q_2 p_1 + p_2 \bar{q}_1) l.$


All the ring axioms with the exception of the associativity of multiplication can
easily be checked. The element $1 \in H$ is the identity of the new ring. It is an
8-dimensional algebra over R: a basis is given for example by {1, i, j, k, l, il, jl, kl}.

The algebra just constructed is called the Cayley algebra or the algebra of
octonions, or the Cayley numbers, and denoted by O.
For $u \in O$ with $u = q_1 + q_2 l$, set


$\bar{u} = \bar{q}_1 - q_2 l, \quad |u| = \sqrt{|q_1|^2 + |q_2|^2}$ and $\mathrm{Tr}\, u = 2\,\mathrm{Re}\, q_1$.


It is easy to see that $u\bar{u} = \bar{u}u = |u|^2$, and that Tr uv = Tr vu. The element u satisfies
the quadratic equation


$u^2 - (\mathrm{Tr}\, u)u + |u|^2 = 0.$    (14)


Theorem. The algebra O has the following property, a weakening of the require- 
ment of associativity: the product of 3 elements does not depend on the distribution 
of brackets if 2 of the elements coincide. In other words, 


u(uv) = (uu)v, (uv)v = u(vv), u(vu) = (uv)u 
(the third of these is a consequence of the first two). 


Rings satisfying these conditions are called alternative rings. It can be proved
that in an alternative ring, the subring generated by any two elements is
associative.

It follows from the properties given above that for $u \neq 0$ the element $u^{-1} =
|u|^{-2}\bar{u}$ is an inverse of u, and $u(u^{-1}v) = v$, $(vu^{-1})u = v$, that is, O is a (nonassociative)
division algebra.

It is not hard to prove that $|uv| = |u| \cdot |v|$. In the basis 1, i, j, k, l, il, jl, kl this
gives a curious identity

$\left( \sum_{i=1}^{8} x_i^2 \right) \left( \sum_{i=1}^{8} y_i^2 \right) = \sum_{i=1}^{8} z_i^2,$

where $z_i$ are integral bilinear forms in $x_1, \dots, x_8$ and $y_1, \dots, y_8$.
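
The multiplication formula of Example 15 is easy to experiment with. The following sketch in plain Python represents a quaternion as a 4-tuple of integers and a Cayley number $q_1 + q_2 l$ as a pair of such tuples; the function names and the sample elements are assumptions of this illustration. It checks the failure of associativity on basis elements, the alternative laws, and the eight squares identity on one pair of elements.

```python
def qmul(p, q):
    """Product of quaternions a + bi + cj + dk stored as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])


def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))


def qsub(p, q):
    return tuple(x - y for x, y in zip(p, q))


def omul(u, v):
    """(p1 + p2 l)(q1 + q2 l) = p1 q1 - conj(q2) p2 + (q2 p1 + p2 conj(q1)) l."""
    (p1, p2), (q1, q2) = u, v
    return (qsub(qmul(p1, q1), qmul(qconj(q2), p2)),
            qadd(qmul(q2, p1), qmul(p2, qconj(q1))))


def norm_squared(u):
    """|u|^2, the sum of the squares of the 8 coordinates."""
    return sum(x * x for part in u for x in part)


ZERO = (0, 0, 0, 0)
i = ((0, 1, 0, 0), ZERO)
j = ((0, 0, 1, 0), ZERO)
l = (ZERO, (1, 0, 0, 0))

# Sample Cayley numbers with integer coordinates (hypothetical data).
u = ((1, 2, -1, 0), (3, 0, 1, -2))
v = ((0, 1, 1, 2), (-1, 2, 0, 1))

print(omul(omul(i, j), l), omul(i, omul(j, l)))             # differ by a sign
print(omul(u, omul(u, v)) == omul(omul(u, u), v))           # u(uv) = (uu)v
print(omul(omul(u, v), v) == omul(u, omul(v, v)))           # (uv)v = u(vv)
print(norm_squared(omul(u, v)) == norm_squared(u) * norm_squared(v))
```

Since the coordinates of the product are bilinear with integer coefficients in the coordinates of the factors, the last comparison is an instance of the identity displayed above.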

The existence of the algebra O is the reason underlying a whole series of 
interesting phenomena of ‘low-dimensional’ geometry (in dimensions 6, 7 and 8). 
For example, we observe that for any plane $E = Ru + Rv \subset O$, the set of all
elements $w \in O$ for which $wE \subset E$ defines a subalgebra C(E) isomorphic to
the complex numbers; this is easy to verify (you need to use the fact that O
is alternative; the subalgebra C(E) is spanned by 1 and $a = vu^{-1}$). Any 6-dimensional
subspace $F \subset O$ can be given by the equations Tr(xu) = 0 for $u \in E$,
where E is some plane. It follows from this that $aF \subset F$ for $a \in C(E)$ (you
need to use the easily verified relation Tr(u(vw)) = Tr((uv)w)). Thus every 6-dimensional
subspace $F \subset O$ has a natural structure of 3-dimensional vector
space over C. In particular, if $X \subset O$ is a smooth 6-dimensional manifold, then
this applies also to its tangent spaces at different points, and the resulting complex
structure varies smoothly with the point. We say that a manifold with this
property is almost complex. The standard example of an almost complex manifold
is a complex analytic manifold. We see that any orientable 6-dimensional submanifold
in $R^8$ is almost complex. However, it is very rare to have a complex
structure on such a manifold. For example, the almost complex structure on $S^6$
arising in this way is not defined by any complex structure.

Another application of the Cayley numbers relates to the exceptional simple 
Lie groups (see § 16), for example, the compact ones. Namely, the compact 
exceptional simple Lie group $G_2$ is isomorphic to the connected component of
the automorphism group of the algebra O. The groups $E_6$ and $F_4$ are also realised
as the 2-dimensional ‘projective’ and ‘orthogonal’ groups related to O.

Finally, the algebra of Cayley numbers plays a special role from a purely 
algebraic point of view. A generalised Cayley algebra is an 8-dimensional algebra
consisting of elements of the form $q_1 + q_2 l$, where $q_1$ and $q_2$ belong to some
generalised quaternion algebra over a field K (see § 11), and the multiplication is
given by


$(p_1 + p_2 l)(q_1 + q_2 l) = p_1 q_1 + \gamma \bar{q}_2 p_2 + (q_2 p_1 + p_2 \bar{q}_1) l,$


where $\gamma \neq 0$ is some element of K. This algebra is always alternative and simple.
It is a nonassociative division algebra if and only if the equation $q_1 \bar{q}_1 - \gamma q_2 \bar{q}_2 = 0$
has no nonzero solutions in the quaternion algebra.


Theorem. Any alternative division algebra is either associative or isomorphic to 
some generalised Cayley algebra. Any alternative simple ring is either associative 
or isomorphic to a generalised Cayley algebra. 


As with associative rings, we can construct projective planes with ‘coordinates’ 
in an arbitrary alternative division algebra. This is one of the simplest examples
of non-Desarguesian projective planes (see § 10). They have a simple geometric
characterisation: a certain weakened form of Desargues’ theorem should hold. 

There are certain other types of nonassociative algebras, for which a fairly 
complete theory can be constructed (at least under the assumptions of finite 
dimensionality and simplicity), and which have mathematical applications. But 
no general theory of nonassociative algebras exists at present (at least, not in the 
sense of a theory to be placed alongside that of associative algebras or Lie 
algebras). Perhaps such a theory is just not possible? Indeed, an arbitrary
finite-dimensional nonassociative algebra is given by a multiplication table $c_{ijk}$
with absolutely no restrictions, so is an arbitrary tensor $C \in \mathcal{L}^* \otimes \mathcal{L}^* \otimes \mathcal{L}$,
defined up to transformations of Aut($\mathcal{L}$). But the question then arises: under
what types of natural restrictions should such a theory exist? How to understand
from a unified point of view the theory of simple associative algebras, Lie
algebras, alternative algebras and certain other types? A test problem could be
the structure of nonassociative division algebras over R. Here there is a remarkable
fact: such division algebras can have dimensions 1, 2, 4 or 8 only. However,
an algebraic proof of this fact is not known. The existing proof is topological,
based on the study of topological properties of the map $(R^n \setminus 0) \times (R^n \setminus 0) \to
(R^n \setminus 0)$ defined by the multiplication of the algebra (which one identifies with $R^n$).


§ 20. Categories


The notion of categories, together with certain notions related to them, forms 
a mathematical language having a specific nature as compared with the standard 
language of set theory, and imparting a somewhat different character to mathe- 
matical constructions. Our description starts with an example. 

We will use diagrams, representing sets by points, and maps from a set X to a 
set Y by arrows from the point for X to the point for Y. If a diagram has points 
representing sets $X, A_1, A_2, \dots, A_n, Y$, and maps $f_1: X \to A_1$, $f_2: A_1 \to A_2$, ...,
$f_{n+1}: A_n \to Y$, then $f_{n+1} \cdots f_2 f_1$ is a certain map from X to Y, the composite of
$f_1, \dots, f_{n+1}$. If for all sets X and Y appearing in the diagram, and for any choice of
the sets $A_i$ and maps $f_i$, the resulting composite map from X to Y is
the same, we say that the diagram is commutative. Examples of commutative
diagrams: the diagrams


          f                        f
      A -----> B               A -----> B
      |        |               | \      |
    u |        | g    and    u |  \ h   | g
      v        v               v   v    v
      C -----> D               C -----> D
          v                        v


are commutative if vu = gf and h = vu = gf respectively.

We now proceed to the example itself. In set theory, there are two operations 
defined on arbitrary sets X and Y (these are not considered to be subsets of a 
fixed set): the sum or disjoint union of sets, denoted by X + Y, and the product 
denoted by $X \times Y$, consisting of pairs $(x, y)$ with $x \in X$ and $y \in Y$.
These operations can be described not by a construction, as we have just done,
but by their general properties. For example, for the sum $X + Y$ we have two
inclusion maps $f: X \to X + Y$ and $g: Y \to X + Y$, and the following universal
mapping property holds: for any set $Z$ and maps $u: X \to Z$ and $v: Y \to Z$,
there exists a unique map $h: X + Y \to Z$ for which the diagram


       f             g
  X ------> X + Y <------ Y
    \         |         /
   u \        | h      / v              (1)
      ↘       ↓       ↙
              Z


is commutative. In exactly the same way, the product $X \times Y$ has projection
maps $f: X \times Y \to X$ and $g: X \times Y \to Y$, and for any set $Z$ and maps
$u: Z \to X$ and $v: Z \to Y$, there exists a unique map $h: Z \to X \times Y$ for
which the diagram
       f               g
  X <------ X × Y ------> Y
    ↖         ↑         ↗
   u \        | h      / v              (2)
      \       |       /
              Z


is commutative (that is, fh = u and gh = v). Obviously h(z) = (u(z), v(z)). 
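The two universal mapping properties can be tried out on small finite sets. The following is a minimal sketch, not part of the original text: the helper names `product_set`, `sum_set`, `pair` and `copair`, the tagging of the disjoint union by 0 and 1, and the sample maps are all assumptions made for illustration.

```python
from itertools import product as cartesian

def product_set(X, Y):
    """Cartesian product X x Y together with its projections f and g."""
    P = set(cartesian(X, Y))
    f = lambda p: p[0]
    g = lambda p: p[1]
    return P, f, g

def pair(u, v):
    """The map h: Z -> X x Y with f h = u and g h = v, i.e. h(z) = (u(z), v(z))."""
    return lambda z: (u(z), v(z))

def sum_set(X, Y):
    """Disjoint union X + Y (elements tagged 0 or 1) with inclusions f and g."""
    S = {(0, x) for x in X} | {(1, y) for y in Y}
    f = lambda x: (0, x)
    g = lambda y: (1, y)
    return S, f, g

def copair(u, v):
    """The map h: X + Y -> Z with h f = u and h g = v."""
    return lambda s: u(s[1]) if s[0] == 0 else v(s[1])

X, Y, Z = {1, 2, 3}, {"a", "b"}, {0, 1, 2, 3}

# Product: maps u: Z -> X and v: Z -> Y factor through X x Y.
P, pf, pg = product_set(X, Y)
u = lambda z: 1 + z % 3
v = lambda z: "a" if z < 2 else "b"
h = pair(u, v)
product_commutes = all(
    h(z) in P and pf(h(z)) == u(z) and pg(h(z)) == v(z) for z in Z
)

# Sum: maps u2: X -> Z and v2: Y -> Z factor through X + Y.
S, sf, sg = sum_set(X, Y)
u2 = lambda x: x % 2
v2 = lambda y: 3 if y == "a" else 0
k = copair(u2, v2)
sum_commutes = all(k(sf(x)) == u2(x) for x in X) and all(
    k(sg(y)) == v2(y) for y in Y
)
```

The sketch only checks that the diagrams commute for one choice of maps; uniqueness of h follows because its value on every element is forced by the two commutativity conditions.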

As a next step we can consider sets in which certain special types of maps are 
distinguished, and consider what constructions are defined by requiring that 
universal mapping properties (1) or (2) hold. For topological spaces and con- 
tinuous maps between them, we obtain, of course, the notions of disjoint union 
and product of topological spaces. The case of groups, where only group homo- 
morphisms are considered as maps, is more interesting. For Abelian groups, and 
more generally for modules over a given ring, the direct sum $A \oplus B$ has both
embeddings $A \to A \oplus B$ and $B \to A \oplus B$ and canonical projections
$A \oplus B \to A$ and $A \oplus B \to B$, and both of the universal mapping
properties of (1) and (2) are
satisfied (of course, u, v and h are now homomorphisms, rather than arbitrary 
maps). Thus here the analogues of the two operations of set theory, disjoint union 
and product, come together. But this is not the case for non-Abelian groups: on 
the direct product $G \times H$, there are canonical homomorphisms
$G \times H \to G$ and $G \times H \to H$ which satisfy the universal mapping
property (2); but although the inclusion maps $f: G \to G \times H$ and
$g: H \to G \times H$ are defined, the universal mapping property (1) does not
hold. To see this, it is enough to take a group K having two subgroups
isomorphic to G and H, but whose elements do not commute with one another, and
let u and v be isomorphisms of G and H with these subgroups. Since elements
$g \in G$ and $h \in H$ commute in $G \times H$, the diagram


       f               g
  G ------> G × H <------ H
    \         |         /
   u \        | h      / v
      ↘       ↓       ↙
              K


will not be commutative for any homomorphism h. Nevertheless, there does 
exist a construction of a group satisfying the universal mapping property (1). It 
is called the free product of G and H. This is the group generated by two 
subgroups G’ and H’ isomorphic to G and H, with no relations between the 
elements of G’ and H’ other than the relations holding in G’ and H’ individually. 
A precise definition can be given by analogy with the definition of a free group 
(§ 14, Example 6). In particular, the free group $S_2$ on two generators is the free
product of two infinite cyclic groups. 

Let us treat another variation on the same theme: we consider as sets com- 
mutative rings which are algebras over a given ring K, and as maps K-algebra 
homomorphisms between them. The direct sum $A \oplus B$ with its canonical
projections $f: A \oplus B \to A$ and $g: A \oplus B \to B$ satisfies the universal
mapping property (2). But although there are natural inclusion maps
$f: A \to A \oplus B$ and $g: B \to A \oplus B$, the analogue of the universal
mapping property (1) does not hold: the point is that for $a \in A$ and
$b \in B$, we always have $ab = 0$ in $A \oplus B$; however, there might exist a
ring C containing subrings A’, B’ isomorphic to A, B, but for which the relation
$ab = 0$ does not always hold; then a homomorphism $h: A \oplus B \to C$ with the
required property does not exist. It is easy to see that in this case the
tensor product $A \otimes_K B$ (see § 12, Example 3) is a ring with the required
property.

In conclusion, let us check that in all the cases we have considered, the 
construction satisfying the universal mapping property (1) or (2) is unique. 
Suppose for example that we are considering the diagram (1), and that for given 
sets X and Y we have two such sets R and S. Then the diagram 


       f           g
  X ------> R <------ Y
    \      ↑ |      /
   u \   h | | k   / v              (3)
      ↘    | ↓    ↙
            S


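must be commutative, where u and v are the inclusion maps of S, and the maps
$k: R \to S$ and $h: S \to R$ are supplied by the universal mapping property of R
and of S respectively. Hence f = hu and u = kf, that is f = (hk)f, and similarly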
g = (hk)g. The requirement that the map h in (1) is unique, applied to the trivial 
case S = R, implies that hk is the identity map of R to itself; in the same way, we 
prove that kh is the identity map of S to itself. Hence R and S are isomorphic. 
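The argument can also be written out step by step. The block below is only a restatement of the proof in formulas, under the reading that f, g are the inclusion maps of R and u, v those of S; the symbols $1_R$ and $1_S$ for the identity maps are notation assumed here.

```latex
\begin{align*}
&kf = u,\quad kg = v
  && \text{($k: R \to S$, from property (1) for $R$ with $Z = S$)}\\
&hu = f,\quad hv = g
  && \text{($h: S \to R$, from property (1) for $S$ with $Z = R$)}\\
&(hk)f = h(kf) = hu = f,\quad (hk)g = h(kg) = hv = g\\
&1_R f = f,\quad 1_R g = g
  && \Longrightarrow\ hk = 1_R \ \text{(uniqueness for $R$ with $Z = R$)}\\
&(kh)u = k(hu) = kf = u,\quad (kh)v = v
  && \Longrightarrow\ kh = 1_S \ \text{(uniqueness for $S$ with $Z = S$)}
\end{align*}
```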

Now we draw the moral from the example we have considered. All the pre- 
ceding arguments involved sets and certain maps between them. But we never 
needed to consider what kind of elements our sets were made up of, or how these 
elements transformed under our maps; the only thing we needed was that maps 
can be composed, and that different maps can be compared with one another—as 
we saw particularly vividly when using the commutativity of diagrams in the 
final argument concerned with diagram (3). This is the approach axiomatised in 
the notion of a category. 

A category $\mathcal{C}$ consists of the following data:

(a) A set $\mathrm{Ob}(\mathcal{C})$, whose elements are called the objects of
$\mathcal{C}$;

(b) for any $A, B \in \mathrm{Ob}(\mathcal{C})$, a set $H(A, B)$ whose elements are
the morphisms in $\mathcal{C}$ from A to B;

(c) for any $A, B, C \in \mathrm{Ob}(\mathcal{C})$, and any $f \in H(A, B)$ and
$g \in H(B, C)$, a morphism $h \in H(A, C)$ called the composite of f and g, and
written $gf$;

(d) for any $A \in \mathrm{Ob}(\mathcal{C})$ a morphism $1_A \in H(A, A)$, called
the identity morphism.

The above data must satisfy the conditions:


$h(gf) = (hg)f$ for $f \in H(A, B)$, $g \in H(B, C)$ and $h \in H(C, D)$;
and


$f 1_A = 1_B f = f$ for $f \in H(A, B)$.
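For finite sets and arbitrary maps, the two conditions can be verified exhaustively. This is a small sketch under assumptions of our own: maps are stored as dictionaries, and the helper names `all_maps`, `compose` and `identity` as well as the sample sets are purely illustrative.

```python
from itertools import product

def all_maps(A, B):
    """All maps from the finite set A to the finite set B, as dictionaries."""
    A = sorted(A)
    return [dict(zip(A, values)) for values in product(sorted(B), repeat=len(A))]

def compose(g, f):
    """The composite gf: first apply f, then g."""
    return {a: g[f[a]] for a in f}

def identity(A):
    """The identity morphism 1_A."""
    return {a: a for a in A}

A, B, C, D = {0, 1}, {0, 1, 2}, {0, 1}, {0, 1}

# h(gf) = (hg)f for all f in H(A, B), g in H(B, C), h in H(C, D).
associative = all(
    compose(h, compose(g, f)) == compose(compose(h, g), f)
    for f in all_maps(A, B)
    for g in all_maps(B, C)
    for h in all_maps(C, D)
)

# f 1_A = 1_B f = f for all f in H(A, B).
unital = all(
    compose(f, identity(A)) == f and compose(identity(B), f) == f
    for f in all_maps(A, B)
)
```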
Thus in category theory, an object $A \in \mathrm{Ob}(\mathcal{C})$ is
characterised not as a set, in terms of the elements it is made up of, but in
terms of its relations with other objects $B \in \mathrm{Ob}(\mathcal{C})$. In
other words, we are interested primarily in its ‘relations’, rather than its
‘construction’.

In the following examples of categories we take the point of view of ‘naive set 
theory’, ignoring logical contradictions that arise when we operate with such 
notions as ‘all sets’, ‘all groups’, and the like. Methods (involving the notion of 
a class) have been established by specialists to allow us to get around these 
contradictions (at least in the opinion of the majority of specialists). 


Example 1. The category $\mathcal{S}et$ whose objects are arbitrary sets and whose
morphisms are arbitrary maps between sets.


Example 2. The category whose objects are arbitrary subsets of a given set X, 
and whose morphisms are the inclusion maps between them (so that $H(A, B)$ is
either empty or consists of a single element). Variation: X is a topological space,
the objects are the open subsets, and the morphisms are inclusions between them. 


Example 3. The category $\mathcal{T}op$ whose objects are topological spaces, and
whose morphisms are continuous maps between them. Variations: the objects
are differentiable (or analytic) manifolds, and the morphisms are differentiable
(or analytic) maps between them. Another important variation: the objects are
topological spaces $(X, x_0)$ with a marked point $x_0$, and the morphisms are
continuous maps $f: X \to Y$ taking the marked point of X into the marked point
of Y, that is, $f(x_0) = y_0$. This category is denoted by $\mathcal{T}op_0$.


Example 4. The category $\mathcal{H}ot$ of topological spaces up to homotopy
equivalence. Two continuous maps $f: X \to Y$ and $g: X \to Y$ between topological
spaces are homotopic if one can be continuously deformed into the other, that
is, if there exists a continuous map $h: X \times I \to Y$ (where $I = [0, 1]$ is
the unit interval) such that $h(x, 0) = f(x)$ and $h(x, 1) = g(x)$. Two spaces X
and Y are homotopy equivalent or have the same homotopy type if there exist
continuous maps $f: X \to Y$ and $g: Y \to X$ such that $gf$ is homotopic to the
identity map $1_X$ of X and $fg$ to $1_Y$. The objects of $\mathcal{H}ot$ are
topological spaces, and morphisms between them are continuous maps up to
homotopy; spaces with the same homotopy type become isomorphic in
$\mathcal{H}ot$. The category $\mathcal{H}ot_0$ is defined by analogy with
Example 3.


Example 5. The category $\mathcal{M}od_R$ whose objects are modules over a given
ring R, and whose morphisms are homomorphisms between them, that is
$H(M, N) = \mathrm{Hom}_R(M, N)$. The category $\mathcal{M}od_{\mathbb{Z}}$ of
Abelian groups is denoted by $\mathcal{A}b$.


Example 6. The category of groups: the objects are arbitrary groups, and the 
morphisms homomorphisms between them. 


Example 7. The category of rings: the objects are arbitrary rings, and the 
morphisms homomorphisms between them. Variations: we consider only
commutative rings, or only algebras over a given ring A (in the latter case, we
take the morphisms to be A-algebra homomorphisms).


Example 8. We now give an example of a category whose morphisms are not
defined as maps between sets. This example is related to the formal group laws,
mentioned in the previous section in connection with Lie theory. The more
standard terminology is a formal group: a formal group is an n-tuple
$\varphi = (\varphi_1, \dots, \varphi_n)$ of formal power series


$\varphi_i(X, Y) = \varphi_i(x_1, \dots, x_n, y_1, \dots, y_n)$


in two sets of variables, $X = (x_1, \dots, x_n)$ and $Y = (y_1, \dots, y_n)$, with
coefficients in an arbitrary field K, and satisfying the conditions


$\varphi(X, \varphi(Y, Z)) = \varphi(\varphi(X, Y), Z)$,
$\varphi(0, 0) = 0$, and $\varphi(X, 0) = \varphi(0, X) = X$.


The number n is called the dimension of the formal group $\varphi$. A
homomorphism of an n-dimensional group $\varphi$ into an m-dimensional group
$\psi$ is an m-tuple $F = (f_1, \dots, f_m)$ of formal power series in n variables
such that


$F(0) = 0$, and $\psi(F(X), F(Y)) = F(\varphi(X, Y))$.


The objects of our category are formal groups defined over a given field K, and 
the morphisms are homomorphisms between them. If K is a field of characteristic 
0, the study of our category reduces completely by Lie theory to the study of the 
category of finite-dimensional Lie algebras over K and homomorphisms between 
them. But if char K = p > 0, a new domain with quite specific properties arises. 
The theory of 1-dimensional formal groups is already far from trivial, and has 
important applications in algebraic geometry, number theory and topology. 


Example 9. A category with a single object $\mathcal{O}$. In this case, the
category is determined by the set $H(\mathcal{O}, \mathcal{O})$, which is an
arbitrary set with an associative operation (composition) and an identity
element. This algebraic notion is called a semigroup with unit.
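As a concrete instance, the residues modulo 6 under multiplication form a semigroup with unit that is not a group, and hence a one-object category in which not every morphism is invertible. The choice of this particular semigroup and the names below are assumptions made only for illustration.

```python
# Morphisms of the single object: residues 0..5; composition is multiplication mod 6.
elements = range(6)
unit = 1

def compose(g, f):
    """Composite of two morphisms of the single object."""
    return (g * f) % 6

associative = all(
    compose(h, compose(g, f)) == compose(compose(h, g), f)
    for f in elements for g in elements for h in elements
)
unital = all(compose(f, unit) == f == compose(unit, f) for f in elements)

# Morphisms having an inverse (the isomorphisms of the object with itself).
invertible = [f for f in elements if any(compose(g, f) == unit for g in elements)]
```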


Example 10. The dual category $\mathcal{C}^*$. For every category $\mathcal{C}$,
the dual category $\mathcal{C}^*$ has the same objects as $\mathcal{C}$, but the
morphisms $H(A, B)$ in $\mathcal{C}^*$ are given by $H(B, A)$ in $\mathcal{C}$, and
the composite of two morphisms f and g in $\mathcal{C}^*$ is defined as the
composite of g and f in $\mathcal{C}$. If we imagine a category as being a single
diagram, in which the objects are represented as points and morphisms as arrows
between them, then $\mathcal{C}^*$ is obtained from $\mathcal{C}$ by reversing the
direction of the arrows.

The notion of the dual category leads to a certain duality in category theory.
Namely, any notion or assertion of category theory can be applied to
$\mathcal{C}^*$ to give a dual notion or assertion in $\mathcal{C}$, obtained from
the first by ‘reversing the arrows’.

Returning to the example treated at the beginning of this section, we can now 
define operations on objects in any category analogous to the sum and product 
of sets. For this, we need to use diagrams (1) and (2), which are meaningful in 
any category; the corresponding objects, if they exist, are called the
category-theoretical sum and product of objects. As we have seen, these do not
always exist. For example, the sum does not exist in the category of finite
groups, but does exist in the category of all groups. However, if the sum or
product exists, then they are unique up to isomorphism (objects A and B are
isomorphic if there exist morphisms $f \in H(A, B)$ and $g \in H(B, A)$ such that
$gf = 1_A$ and $fg = 1_B$): the proof given above is meaningful in any category.
We can say that in the category of modules, sum and product both coincide with
the direct sum of modules; in the category of groups, sum coincides with free
product, and product with direct product of groups; in the category of
commutative rings, sum coincides with tensor product and product with direct
sum. In the category of topological spaces, sum and product coincide with the
same operations on sets. In the category of topological spaces with a marked
point, the product of spaces $(X, x_0)$ and $(Y, y_0)$ is their ordinary product
$X \times Y$ with marked point $(x_0, y_0)$; but the sum is the so-called bouquet
$X \vee Y$, consisting of X and Y glued together at the points $x_0$ and $y_0$.
For example, the bouquet of two circles is a ‘figure-eight’ (Figure 40).


Fig. 40 


Since diagrams (1) and (2) which serve as the definition of sum and product 
are obtained from one another by reversing the arrows, these notions are dual 
to one another, that is, they go into one another on passing from the category
$\mathcal{C}$ to the dual category $\mathcal{C}^*$.

The idea of an ‘invariant’ or ‘natural’ construction in the language of categories 
is expressed through the notion of a functor. A covariant functor from a category 
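$\mathcal{C}$ to a category $\mathcal{D}$ consists of two maps (denoted by the same
letter), a map $F: \mathrm{Ob}(\mathcal{C}) \to \mathrm{Ob}(\mathcal{D})$, and for
any $A, B \in \mathrm{Ob}(\mathcal{C})$ a map $F: H(A, B) \to H(F(A), F(B))$,
satisfying the conditions:


$F(1_A) = 1_{F(A)}$ for $A \in \mathrm{Ob}(\mathcal{C})$
and $F(fg) = F(f)F(g)$ whenever $fg$ is defined in $\mathcal{C}$.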


For example, in the category of vector spaces, the map $E \mapsto T^r E$ taking a
vector space to its r-fold tensor product is obviously compatible with linear
maps: if $f: E \to E'$ is a linear map then
$(T^r f)(x_1 \otimes \cdots \otimes x_r) = f(x_1) \otimes \cdots \otimes f(x_r)$
defines a linear map $T^r f: T^r E \to T^r E'$. It is easy to see that together
these two maps define a covariant functor $F = T^r$, the r-fold tensor product;
this is a functor from the category of vector spaces to itself.
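The two functor conditions for the r-fold tensor product can be checked on decomposable tensors, which span $T^r E$. In the sketch below the second linear map $g: E' \to E''$ and the notation $T^r f$ for the induced map are assumptions introduced for the purpose of the check.

```latex
\begin{align*}
T^r(gf)(x_1 \otimes \cdots \otimes x_r)
  &= gf(x_1) \otimes \cdots \otimes gf(x_r) \\
  &= T^r(g)\bigl(f(x_1) \otimes \cdots \otimes f(x_r)\bigr) \\
  &= T^r(g)\,T^r(f)(x_1 \otimes \cdots \otimes x_r), \\
T^r(1_E)(x_1 \otimes \cdots \otimes x_r)
  &= x_1 \otimes \cdots \otimes x_r .
\end{align*}
```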

A contravariant functor is also given by a map
$F: \mathrm{Ob}(\mathcal{C}) \to \mathrm{Ob}(\mathcal{D})$, but this time, for
$A, B \in \mathrm{Ob}(\mathcal{C})$ it defines a map


$F: H(A, B) \to H(F(B), F(A))$
(in the reverse order), with the conditions


$F(1_A) = 1_{F(A)}$ and $F(fg) = F(g)F(f)$


(also in the reverse order). A typical example of a contravariant functor is the 
operation of taking a vector space into its dual vector space. 


Example 11. Let A be a commutative ring and M, N two A-modules. Fixing
the module N, we set $F_N(M) = M \otimes_A N$, and take a homomorphism
$f: M \to M'$ into the homomorphism $F_N(f): M \otimes_A N \to M' \otimes_A N$
given by $F_N(f)(m \otimes n) = f(m) \otimes n$; then $F_N$ is a covariant functor
from $\mathcal{M}od_A$ into itself. We set $G_N(M) = \mathrm{Hom}_A(M, N)$, and for
$f: M \to M'$ and $g \in \mathrm{Hom}_A(M', N)$, write $G_N(f)(g)$ for the
composite $gf$; then $G_N$ is a contravariant functor from $\mathcal{M}od_A$ into
itself. If R is a noncommutative ring then $G_N(M) = \mathrm{Hom}_R(M, N)$ is a
contravariant functor from $\mathcal{M}od_R$ into $\mathcal{A}b$.

Here are some more examples, in which we only indicate the effect of the
functor on $\mathrm{Ob}(\mathcal{C})$: the reader will easily guess its action on
the sets $H(A, B)$.


Example 12. The standard constructions of topology are functors. Consider the
path space $H(I, X)$ of a topological space X: this is the set of continuous maps
$\varphi: I \to X$ of the interval $I = [0, 1]$ into X. The topology of $H(I, X)$ is
determined by the requirement that for each open set $U \subset X$, the set
$\{\varphi \mid \varphi(I) \subset U\}$ should be open. Since for any map
$f \in H(X, Y)$ composing with f takes a path $\varphi \in H(I, X)$ into a path
$f\varphi \in H(I, Y)$, it follows that $H(I, X)$ is a covariant functor from the
category $\mathcal{T}op$ into itself. Most frequently it is considered on the
category $\mathcal{T}op_0$ of topological spaces $(X, x_0)$ with a marked point.
Then by definition $H(I, X)$ consists only of those maps $\varphi: I \to X$ for
which $\varphi(0) = x_0$. Finally, if in $H(I, X)$ we consider only those maps for
which $\varphi(0) = \varphi(1) = x_0$ (that is $H(S^1, X)$, where $S^1$ is the
circle with a marked point), then we get $\Omega X$, the loop space of X. All of
these covariant functors carry over naturally to functors on the homotopy
category $\mathcal{H}ot$ of Example 4.


Example 13. The majority of topological invariants are groups, and are functors
from the category $\mathcal{T}op_0$ or $\mathcal{H}ot_0$ into the category of groups
or of Abelian groups. Thus the fundamental group $\pi(X)$ (§ 14, Example 7) is a
covariant functor from the category of topological spaces with a marked point
into the category of groups; the homotopy groups $\pi_n(X)$ for $n \ge 2$, the
homology groups $H_n(X, A)$ and the cohomology groups $H^n(X, A)$ (the definition
of which will be discussed in § 21 below) are functors from the same category
into the category of Abelian groups. The functors $\pi_n$ and $H_n$ are covariant
and $H^n$ are contravariant. All of these objects are invariant with respect to
homotopy equivalence, and pass to the category $\mathcal{H}ot$.


Example 14. An important functor is the space of functions
$\mathcal{F}(X, \mathbb{R})$ on a set X (say real-valued). Here a number of
variations are possible: if X is discrete (for example, finite) then we consider
all maps, if X is a topological space, continuous maps, if X has a measure,
square-integrable functions, and so on. Since a map $f: X \to Y$ takes a function
$\varphi \in \mathcal{F}(Y, \mathbb{R})$ into
$\varphi f \in \mathcal{F}(X, \mathbb{R})$, it follows that
$\mathcal{F}(X) = \mathcal{F}(X, \mathbb{R})$ is a contravariant functor from the
category of sets (or topological spaces, or spaces with measure, ...) into the
category of vector spaces. It is precisely due to the fact that
$\mathcal{F}(X, \mathbb{R})$ is a functor that any transformation of X
corresponds to an invertible linear transformation of
$\mathcal{F}(X, \mathbb{R})$, and if G is a transformation group of X then
$\mathcal{F}(X, \mathbb{R})$ has a representation of G defined on it. In
particular, if X = G and we consider the action of G on itself by left
translations, then $\mathcal{F}(X, \mathbb{R})$ is the regular representation of G.


Example 15. In an arbitrary category $\mathcal{C}$, any object
$A \in \mathrm{Ob}(\mathcal{C})$ defines a covariant functor $h_A$ from
$\mathcal{C}$ to the category of sets $\mathcal{S}et$; we set $h_A(X) = H(A, X)$,
and for any $f \in H(X, Y)$, we define the map $h_A(f): H(A, X) \to H(A, Y)$ as
composition with f, that is $h_A(f)(g) = fg$ for $g \in H(A, X)$. In the same way
$h^A(X) = H(X, A)$ defines a contravariant functor. We have already met the
functors $h^N(M) = \mathrm{Hom}_R(M, N)$ on the category $\mathcal{M}od_R$
(Example 11) and $h_{S^1}(X) = \Omega X$ on $\mathcal{T}op_0$ (Example 12).
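For finite sets the functor conditions for the two functors defined by an object A can be checked by brute force. The sketch below makes several assumptions of its own: a finite set is a tuple, a map out of it is a tuple of values listed in the same order, and the names `hom`, `compose`, `h_cov` and `h_contra` are hypothetical helpers.

```python
from itertools import product

def hom(A, B):
    """H(A, B): every map A -> B, stored as a tuple of values indexed like A."""
    return list(product(B, repeat=len(A)))

def compose(g, f, B):
    """The composite gf for f: A -> B and g: B -> C (B locates positions in g)."""
    return tuple(g[B.index(b)] for b in f)

def h_cov(A, f, X):
    """Covariant case: for f: X -> Y, the map H(A, X) -> H(A, Y), g |-> fg."""
    return {g: compose(f, g, X) for g in hom(A, X)}

def h_contra(A, f, Y):
    """Contravariant case: for f: X -> Y, the map H(Y, A) -> H(X, A), g |-> gf."""
    return {g: compose(g, f, Y) for g in hom(Y, A)}

A = (0, 1)
X, Y, Z = ("x", "y"), (0, 1, 2), ("p", "q")

# Covariant law: the map induced by f2 f1 is (induced by f2) after (induced by f1).
covariant_ok = all(
    h_cov(A, compose(f2, f1, Y), X)
    == {g: h_cov(A, f2, Y)[h_cov(A, f1, X)[g]] for g in hom(A, X)}
    for f1 in hom(X, Y) for f2 in hom(Y, Z)
)

# Contravariant law: the order of the induced maps is reversed.
contravariant_ok = all(
    h_contra(A, compose(f2, f1, Y), Z)
    == {g: h_contra(A, f1, Y)[h_contra(A, f2, Z)[g]] for g in hom(Z, A)}
    for f1 in hom(X, Y) for f2 in hom(Y, Z)
)
```

Both checks reduce to associativity of composition, which is why the same argument works in any category.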

The functors $h_A$ and $h^A$ are useful in a very general situation when we want to
carry over some construction defined for sets to any categories. If $\Phi$
denotes a set-theoretical construction and $\Psi$ is the construction in a
category $\mathcal{C}$ we are looking for, then we require that the functors
$h_{\Psi(A)}$ and $\Phi(h_A)$ are equivalent (applying the functor $h^A$ in place
of $h_A$ gives a different dual construction). Here we say that two functors
$F_1$ and $F_2$ from $\mathcal{C}$ to $\mathcal{S}et$ are equivalent if for any
$X \in \mathrm{Ob}(\mathcal{C})$ we can define an invertible map
$\varphi_X: F_1(X) \to F_2(X)$ such that for any $Y \in \mathrm{Ob}(\mathcal{C})$
and any $f \in H(X, Y)$ the diagram


            φ_X
  F_1(X) --------> F_2(X)
    |                |
    | F_1(f)         | F_2(f)
    ↓                ↓
  F_1(Y) --------> F_2(Y)
            φ_Y


is commutative. For example, it is easy to see that the definition of sum $A + B$
in a category reduces to the requirement that the functors $h_{A+B}(X)$ and
$h_A(X) \times h_B(X)$ are equivalent. The product is related in a similar way to
the functor $h^A$.

As an application, we discuss the very important notion of a group (or group
object) in a category $\mathcal{C}$. We will suppose that products exist in
$\mathcal{C}$. A group law on an object $G \in \mathrm{Ob}(\mathcal{C})$ is defined
as a morphism $\mu \in H(G \times G, G)$; the operation of passing to the inverse
is determined by specifying a morphism $\iota \in H(G, G)$. Finally, the
existence of a unit element is simplest to state if we assume that in
$\mathcal{C}$ there exists a final object $\mathcal{O} \in \mathrm{Ob}(\mathcal{C})$,
that is, an object such that $H(A, \mathcal{O})$ consists of a single element for
every $A \in \mathrm{Ob}(\mathcal{C})$ (a single point in $\mathcal{S}et$ or
$\mathcal{T}op$, the zero group in $\mathcal{A}b$, etc.). Then the identity element
is defined as a morphism $\varepsilon \in H(\mathcal{O}, G)$. These three
morphisms, $\mu$, $\iota$ and $\varepsilon$ should be subject to a number of
conditions which express the associativity of multiplication and the other
group axioms; these can be stated as requirements that certain diagrams be
commutative. But we will not list these here, since a simpler requirement which
is equivalent to all of these conditions is that for any
$A \in \mathrm{Ob}(\mathcal{C})$ they define a group law on the set $H(A, G)$ (maps
into a group themselves form a group!), and for any $f \in H(A, B)$ the
composition map $H(B, G) \to H(A, G)$ is a homomorphism. In other words, the
functor $h^G$ should be a functor from $\mathcal{C}$ into the category of groups.

For example, a Lie group is a group object in the category of (differentiable 
or complex analytic) manifolds. 
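The remark that maps into a group themselves form a group, and that composition with a map is a homomorphism, can be tested in the category of sets. The sketch below takes G to be the cyclic group of order 3 written additively; this choice, the sample sets A and B, the sample map f and the helper names are assumptions for illustration only.

```python
from itertools import product

N = 3                      # G = Z/3 under addition mod 3
A = (0, 1)                 # a two-element set
B = ("p", "q", "r")        # a three-element set

def hom_into_G(S):
    """H(S, G): maps S -> G as tuples of values indexed like S."""
    return list(product(range(N), repeat=len(S)))

def add(g1, g2):
    """Group law on H(S, G) induced by the group law of G (pointwise sum)."""
    return tuple((a + b) % N for a, b in zip(g1, g2))

def pull_back(g, f, S):
    """The composite gf for f: T -> S (indexed like T) and g: S -> G."""
    return tuple(g[S.index(s)] for s in f)

unit = tuple(0 for _ in A)
H_A = hom_into_G(A)
is_group = (
    all(add(add(x, y), z) == add(x, add(y, z)) for x in H_A for y in H_A for z in H_A)
    and all(add(x, unit) == x == add(unit, x) for x in H_A)
    and all(any(add(x, y) == unit for y in H_A) for x in H_A)
)

f = ("p", "p")             # a sample map f: A -> B
is_homomorphism = all(
    pull_back(add(g1, g2), f, B) == add(pull_back(g1, f, B), pull_back(g2, f, B))
    for g1 in hom_into_G(B) for g2 in hom_into_G(B)
)
```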


Example 16. Consider the loop space $\Omega X$ over a topological space X with a
marked point $x_0$ (Example 12). Composition of loops, as defined in § 14,
Example 7, defines a continuous map $\mu: \Omega X \times \Omega X \to \Omega X$,
reversing a loop defines a map $\iota: \Omega X \to \Omega X$, and the loop
reduced to $x_0$ defines a unit element. This data does not define a group: for
example, the product of a loop with its inverse is not equal to the unit loop,
but only homotopic to it. But in the category $\mathcal{H}ot_0$ (Example 4) we do
obtain a group, as one checks easily.


Example 17. We are going to use the operation of contracting to a point a
closed subset A of a topological space X. By this we mean the topological space,
denoted by $X/A$, which set-theoretically consists of $X \setminus A$ plus a single
extra point a; the contraction map $p: X \to X/A$ is then defined as being the
identity on $X \setminus A$ and taking A to a. An open subset of $X/A$ is defined
to be a set whose inverse image under p is open in X.

The suspension of a topological space X is the space $\Sigma X$ obtained from the
cylinder $X \times I$ (where $I = [0, 1]$) by contracting its top and bottom faces
$X \times 0$ and $X \times 1$ to two points (see Figure 41).




Fig. 41 
In the category of spaces with a marked point, we consider the reduced
suspension SX obtained from $\Sigma X$ by contracting also $x_0 \times I$ to a
point, which is taken as the marked point of SX. The properties of SX are
simplest to state using the smash product operation $X \wedge Y$ in
$\mathcal{T}op_0$ and $\mathcal{H}ot_0$. By definition, for spaces $(X, x_0)$ and
$(Y, y_0)$ we set


$X \wedge Y = (X \times Y)/(X \times y_0 \cup x_0 \times Y)$.


It is easy to check that this operation is distributive with respect to the sum
operation $X \vee Y$ (see Figure 40), that is,


$X \wedge (Y \vee Z) = (X \wedge Y) \vee (X \wedge Z)$
and                                                        (4)
$(X \vee Y) \wedge Z = (X \wedge Z) \vee (Y \wedge Z)$.


In particular, the reduced suspension is given by $SX = S^1 \wedge X$ where $S^1$
is the circle, obtained from $[0, 1]$ by glueing the points 0 and 1. It is easy to
see that $SS^1 = S^1 \wedge S^1 = S^2$, and more generally $SS^n = S^{n+1}$ (where
$S^n$ is the n-dimensional sphere).

Identifying the points 0 and 1/2 in the circle $S^1 = I/\{0, 1\}$ gives a map
$S^1 \to S^1 \vee S^1$ (Figure 42),


Fig. 42 


and hence in $\mathcal{T}op_0$ and $\mathcal{H}ot_0$, a map
$SX \to SX + SX$


(since $\vee$ is the sum in this category, $SX = S^1 \wedge X$ and $\wedge$ is
distributive). By duality we have a morphism $\mu: (SX) \times (SX) \to SX$ in the
dual category. Reversing the circle, corresponding to the symmetry of $S^1$ with
respect to the point 1/2, defines a map $S^1 \to S^1$ and hence
$\iota: SX \to SX$. Mapping the whole space to a point defines a unit element. It
is easy to see that in this way SX defines a group object in the category
$\mathcal{H}ot_0^*$.


Example 18. The groups in $\mathcal{H}ot_0$ and $\mathcal{H}ot_0^*$ constructed in
Examples 16-17 define important invariants which are now ordinary groups; this
is for the simple reason that by definition, if G is a group object in a
category $\mathcal{C}$ then $H(A, G)$ is a group for any
$A \in \mathrm{Ob}(\mathcal{C})$. Similarly, if $G \in \mathrm{Ob}(\mathcal{C})$ is a
group in the dual category $\mathcal{C}^*$ then $H(G, A)$ is a group for any
$A \in \mathrm{Ob}(\mathcal{C})$. Hence if we make any choice of a topological
space R (with marked point), then for any topological space X both
$H(X, \Omega R)$ and $H(SR, X)$ will be groups, and functors from the category
$\mathcal{H}ot_0$ into the category of groups. We have already met one of these:
if R consists of two points, then $SR = S^1$ (Figure 43) and $H(SR, X)$ is the
fundamental group $\pi(X)$. But since $S^n = SS^{n-1}$, also $H(S^n, X)$ is a group
for any $n \ge 1$. This is denoted by $\pi_n(X)$ and is called the nth homotopy
group of X. In particular, $\pi(X) = \pi_1(X)$.




Fig. 43 


For $n \ge 2$ the group $\pi_n(X)$ is Abelian, and the reason for this is also
categorical: it consists of the fact that for $n \ge 2$, $S^n = S(SS^{n-2})$, and
for any spaces R and X the group $H(S(SR), X)$ is Abelian. In fact, the
representation $SSR = S^1 \wedge S^1 \wedge R$ allows us to define two maps


$SSR \to SSR \vee SSR$


using the map $S^1 \to S^1 \vee S^1$ of Figure 42 for either the first factor $S^1$
in $SSR = S^1 \wedge S^1 \wedge R$ or the second. From this we get two group laws
on $H(S(SR), X)$, which we denote by $\cdot$ and $*$. They have a distributivity
property: $(f \cdot g) * (u \cdot v) = (f * u) \cdot (g * v)$; this can be checked
very easily from the definitions using the distributivity property (4) of
$\wedge$. Moreover, both operations have the same unit element e. But it follows
formally from this that the two operations coincide and are commutative:


$f \cdot g = (f * e) \cdot (e * g) = (f \cdot e) * (e \cdot g) = f * g$,
and
$g \cdot f = (e * g) \cdot (f * e) = (e \cdot f) * (g \cdot e) = f * g = f \cdot g$.

Similarly, $H(X, \Omega\Omega R)$ is an Abelian group for any spaces R and X.
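The formal argument uses only the common unit and the distributivity property, so it can be confirmed by exhaustive search on a small set. The following sketch enumerates all pairs of binary operations on a three-element set with a common two-sided unit; the size of the set and all names are assumptions chosen for illustration, and the search says nothing about topology itself.

```python
from itertools import product

N = 3                 # underlying set {0, 1, 2}
E = 0                 # the common unit element e
ELEMENTS = range(N)

def unital_operations():
    """All binary operations on ELEMENTS that have E as a two-sided unit."""
    free_cells = [(a, b) for a in ELEMENTS for b in ELEMENTS if a != E and b != E]
    for values in product(ELEMENTS, repeat=len(free_cells)):
        table = {(E, x): x for x in ELEMENTS}
        table.update({(x, E): x for x in ELEMENTS})
        table.update(zip(free_cells, values))
        yield table

def interchange(dot, star):
    """Check (f.g)*(u.v) == (f*u).(g*v) for all f, g, u, v."""
    return all(
        star[dot[f, g], dot[u, v]] == dot[star[f, u], star[g, v]]
        for f, g, u, v in product(ELEMENTS, repeat=4)
    )

operations = list(unital_operations())
compatible = [
    (dot, star) for dot in operations for star in operations if interchange(dot, star)
]

# Every compatible pair consists of one and the same commutative operation.
coincide = all(dot == star for dot, star in compatible)
commutative = all(
    dot[a, b] == dot[b, a] for dot, _ in compatible for a in ELEMENTS for b in ELEMENTS
)
```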


We can also give a definition of cohomology groups in the same spirit. For
this one proves that for any choice of an Abelian group A there exists a sequence
of spaces $K_n$ for n = 1, 2, 3, ... with marked points, such that
$\pi_m(K_n) = 0$ for $m \ne 0, n$,
$\pi_n(K_n) = A$,                                          (5)
and
$K_{n-1} = \Omega K_n$ (in $\mathcal{H}ot_0$).


Then just as before, the set $H(X, K_n)$ is an Abelian group; this is called the
nth cohomology group of X with coefficients in A, and denoted by $H^n(X, A)$.
(This is not the most natural method of defining cohomology groups, nor the
original one: this will be discussed in the next section.) For the n-sphere
$S^n$, we have


$H^m(S^n, A) = 0$ for $m \ne 0, n$,
$H^n(S^n, A) = A$,
and
$S^n = SS^{n-1}$ (in $\mathcal{H}ot_0$),


so that the spheres $S^n$ are in this sense analogous to the spaces $K_n$ (with the
groups $\pi_n$ instead of $H^n$).

Many of the most splendid achievements of topology (for example, those 
connected with the study of the groups $\pi_m(S^n)$) are based on the ideology
which we have tried to hint at in the preceding constructions: the category
$\mathcal{H}ot_0$ can
to a significant extent be treated as an algebraic notion, in many respects 
analogous, for example, to the category of modules, and the intuition of algebra 
can successfully be applied to it. 


§21. Homological Algebra 


A. Topological Origins of the Notions of Homological Algebra 


The algebraic aspect of homology theory is not complicated. A chain complex
is a sequence $\{C_n\}_{n \in \mathbb{Z}}$ of Abelian groups (most often $C_n = 0$
for $n < 0$) and connecting homomorphisms $\partial_n: C_n \to C_{n-1}$, called
boundary maps; a cochain complex is a sequence $\{C^n\}_{n \in \mathbb{Z}}$ of
Abelian groups and homomorphisms $d_n: C^n \to C^{n+1}$, called coboundary maps or
differentials. The boundary homomorphisms of a chain complex must satisfy the
condition $\partial_n \partial_{n+1} = 0$ for all $n \in \mathbb{Z}$, and the
coboundary of a cochain complex the condition $d_{n+1} d_n = 0$. Thus a complex
is defined not just by the system of groups, but also by the homomorphisms, and
we will for example denote a chain complex by $K = \{C_n, \partial_n\}$.
The condition $\partial_n \partial_{n+1} = 0$ in the definition of a chain complex
shows that $\partial_{n+1}(C_{n+1})$, the image of $\partial_{n+1}$, is contained
in the kernel of $\partial_n$, that is,
$\mathrm{Im}\,\partial_{n+1} \subset \mathrm{Ker}\,\partial_n$. The quotient group


$H_n(K) = \mathrm{Ker}\,\partial_n / \mathrm{Im}\,\partial_{n+1}$


is called the nth homology group of the chain complex $K = \{C_n, \partial_n\}$,
and denoted by $H_n(K)$. Similarly, for a cochain complex $K = \{C^n, d_n\}$, we
have $\mathrm{Im}\,d_{n-1} \subset \mathrm{Ker}\,d_n$; the group
$\mathrm{Ker}\,d_n / \mathrm{Im}\,d_{n-1}$ is called the nth cohomology group of K,
and denoted by $H^n(K)$.
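A very small worked example may help fix the definition. The complex below is chosen here purely for illustration: it has $C_1 = C_0 = \mathbb{Z}$, all other groups zero, and $\partial_1$ equal to multiplication by 2.

```latex
\[
\cdots \longrightarrow 0 \longrightarrow \mathbb{Z}
\xrightarrow{\;\partial_1 = 2\;} \mathbb{Z} \longrightarrow 0 \longrightarrow \cdots
\]
\begin{align*}
H_1(K) &= \mathrm{Ker}\,\partial_1 / \mathrm{Im}\,\partial_2 = 0 / 0 = 0, \\
H_0(K) &= \mathrm{Ker}\,\partial_0 / \mathrm{Im}\,\partial_1
        = \mathbb{Z} / 2\mathbb{Z}, \\
H_n(K) &= 0 \quad \text{for } n \ne 0, 1 .
\end{align*}
```

The condition $\partial_n \partial_{n+1} = 0$ holds trivially here, since in every composite of two consecutive maps at least one is zero.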

Here are the two basic situations in which these notions arise. 


Example 1. An n-dimensional simplex or n-simplex is the convex hull of n + 1 
points in Euclidean space not lying in an $(n-1)$-dimensional subspace. A complex
is a set made up of simplexes meeting along whole faces, such that the complex 
contains together with a simplex also all of its faces, and such that every point 
belongs to only a finite number of simplexes. As a topological space, a complex 
is determined by its set of vertexes, together with the data of which vertexes 
form simplexes. Thus this is a finite method of specifying a topological space, 
analogous to defining a group by generators and relations. A topological space 
X homeomorphic to a complex is called a polyhedron, and a homeomorphism 
of X with a complex is called a triangulation of X. Thus a triangulation is a 
partition of a space into pieces which are homeomorphic to simplexes, and which 
‘fit together nicely’. Figure 44 shows a triangulation of the sphere. Spaces which 
arise in practice usually admit triangulations; for example, this is the case for 
differentiable manifolds. But there are many such triangulations, just as there are 
many representations of a group by generators and relations. 




Fig. 44 


We can associate with each complex X a chain complex
$K = \{C_n, \partial_n\}_{n \ge 0}$. Here $C_n = \bigoplus \mathbb{Z}\sigma_i$ is the
free $\mathbb{Z}$-module with generators corresponding to n-simplexes $\sigma_i$
of X. To define the homomorphisms $\partial_n$, each simplex $\sigma_i$ is
oriented, that is, a definite ordering $\sigma_i = \{x_0, \dots, x_n\}$ of its
vertexes is chosen. We then set


$\partial_n \sigma_i = \sum_k (-1)^k \varepsilon_k \sigma_i^k$,


where $\sigma_i^k$ is the simplex $\{x_0, \dots, x_{k-1}, x_{k+1}, \dots, x_n\}$,
and $\varepsilon_k = 1$ or $-1$, depending on whether
$x_0, \dots, x_{k-1}, x_{k+1}, \dots, x_n$ is an even or odd permutation of the
sequence of vertexes of $\sigma_i^k$ in the chosen orientation; $\partial_n$ is
then extended to the whole group $C_n$ by additivity. The property
$\partial_n \partial_{n+1} = 0$ is easy to check. Elements
$x_n \in \mathrm{Ker}\,\partial_n$ are called cycles, and
$y_n \in \mathrm{Im}\,\partial_{n+1}$ boundaries; the groups $H_n(K)$ are called
the homology groups of X, and denoted by $H_n(X)$. The geometric meaning of an
element $x \in H_n(X)$ is that it is a closed n-dimensional piece of the space X,
and two pieces are identified if together they bound an $(n+1)$-dimensional
piece. For example, in Figure 45, (a) c and c’ are closed curves on the torus
defining the same element of $H_1(X)$, and in Figure 45, (b) the curve d on the
‘double torus’ is the zero element.
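The boundary maps of a small complex can be written down as integer matrices and the relation that the boundary of a boundary vanishes checked directly. The sketch below does this for the boundary of a tetrahedron, a triangulation of the 2-sphere. It rests on assumptions of our own: every simplex is ordered by increasing vertex number (so all the signs $\varepsilon_k$ equal 1), and only the ranks of the homology groups are computed, by Gaussian elimination over the rationals, so any torsion would be invisible.

```python
from fractions import Fraction
from itertools import combinations

def faces(simplex):
    """Codimension-one faces of an ordered simplex, with their signs (-1)^k."""
    return [((-1) ** k, simplex[:k] + simplex[k + 1:]) for k in range(len(simplex))]

def boundary_matrix(n_simplexes, lower_simplexes):
    """Matrix of the boundary map: rows are (n-1)-simplexes, columns n-simplexes."""
    row = {s: i for i, s in enumerate(lower_simplexes)}
    matrix = [[0] * len(n_simplexes) for _ in lower_simplexes]
    for j, s in enumerate(n_simplexes):
        for sign, face in faces(s):
            matrix[row[face]][j] += sign
    return matrix

def rank(matrix):
    """Rank over the rationals, by Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(P, Q):
    """Product of two integer matrices given as lists of rows."""
    return [
        [sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
        for i in range(len(P))
    ]

# Boundary of a tetrahedron: 4 vertexes, 6 edges, 4 triangles.
vertices = (0, 1, 2, 3)
simplexes = {n: list(combinations(vertices, n + 1)) for n in range(3)}
d1 = boundary_matrix(simplexes[1], simplexes[0])
d2 = boundary_matrix(simplexes[2], simplexes[1])

boundary_of_boundary_is_zero = all(x == 0 for r in matmul(d1, d2) for x in r)

# rank H_n = dim Ker d_n - rank d_{n+1}; here d_0 = 0 and d_3 = 0.
betti = [
    len(simplexes[0]) - rank(d1),
    len(simplexes[1]) - rank(d1) - rank(d2),
    len(simplexes[2]) - rank(d2),
]
```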


(a) (b) 


Fig. 45 


The basic property of the groups $H_n(X)$, which is completely nonobvious from
the definition we have given, is that they do not depend on the triangulation
of the polyhedron X, but only on X as a topological space. Moreover, they
define covariant functors from the category $\mathcal{H}ot$ to the category
$\mathcal{A}b$ of Abelian groups. In other words, a continuous map $f: X \to Y$
induces homomorphisms $f_{*n}: H_n(X) \to H_n(Y)$, depending only on the homotopy
class of f, and satisfying the conditions in the definition of a functor.

It is precisely the ‘functorial’ character of the groups $H_n(X)$ which makes them
so useful in topology: they determine a ‘projection’ of the topology into algebra.
We give a very simple example. It is easy to prove that the n-dimensional sphere
$S^n$ satisfies


$H_0(S^n) = \mathbb{Z} = H_n(S^n)$
and $H_k(S^n) = 0$ for $k \ne 0, n$;


on the other hand, the n-dimensional ball $B^n$ has the homotopy type of a point,
since it can be contracted radially to the centre, and so $H_k(B^n) = 0$ for
$k \ne 0$. We now give a proof of the famous Brouwer fixed point theorem: every
continuous map $\varphi: B^n \to B^n$ has a fixed point. For otherwise, for any
point $x \in B^n$, draw the ray from $\varphi(x)$ to x and write $f(x)$ for its
intersection with the boundary $S^{n-1}$ of $B^n$. Then f is a continuous map, and
$f(x) = x$ for $x \in S^{n-1}$; that is, $f \circ i = 1$, where i is the inclusion
$i: S^{n-1} \hookrightarrow B^n$ of $S^{n-1}$ as the boundary sphere, and 1 is the
identity map of $S^{n-1}$. Now functoriality gives a map
$f_{*(n-1)}: H_{n-1}(B^n) \to H_{n-1}(S^{n-1})$, which must be the zero map (since
$H_{n-1}(B^n) = 0$). But on the other hand, $f_{*(n-1)} \circ i_{*(n-1)} = 1$,
which gives a contradiction.
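The contradiction can be displayed as a factorisation of the identity through the zero group. The display below is a sketch that assumes $n \ge 2$, so that $n - 1 \ne 0$ and the stated values of the homology groups apply; the case $n = 1$ is not covered by it.

```latex
\[
H_{n-1}(S^{n-1}) \xrightarrow{\; i_{*(n-1)} \;} H_{n-1}(B^n)
\xrightarrow{\; f_{*(n-1)} \;} H_{n-1}(S^{n-1}),
\qquad
f_{*(n-1)} \circ i_{*(n-1)} = (f \circ i)_{*(n-1)} = 1 .
\]
\[
\mathbb{Z} \longrightarrow 0 \longrightarrow \mathbb{Z}
\quad\Longrightarrow\quad
f_{*(n-1)} \circ i_{*(n-1)} = 0 \ne 1_{\mathbb{Z}} .
\]
```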

Alongside the chain complex $K = \{C_n, \partial_n\}$ constructed above for an
arbitrary polyhedron X, we can take any Abelian group A and construct the
chain complex $K \otimes_{\mathbb{Z}} A = \{C_n \otimes_{\mathbb{Z}} A, \partial_n\}$
and the cochain complex $\mathrm{Hom}(K, A) = \{\mathrm{Hom}(C_n, A), d_n\}$; recall
that the functor $F(C) = C \otimes_{\mathbb{Z}} A$ is covariant and
$G(C) = \mathrm{Hom}(C, A)$ is contravariant (see § 20, Example 11). The group
$H_n(K \otimes A)$ is denoted by $H_n(X, A)$, and
$H^n(\mathrm{Hom}(K, A)) = H^n(X, A)$. These are called the homology and
cohomology groups of X with coefficients in A. Homology is a covariant and
cohomology a contravariant functor from $\mathcal{H}ot$ to the category
$\mathcal{A}b$. The groups $H^n(X, A)$ have already been mentioned at the end of
§ 20.


Example 2. Let $X$ be a differentiable manifold and $\Omega^r$ the space of differential
$r$-forms on $X$. If $\varphi \in \Omega^r$ is a form, written in local coordinates as
$\varphi = \sum f_{i_1 \dots i_r}\, dx_{i_1} \wedge \dots \wedge dx_{i_r}$, then the differential of $\varphi$ is the form

$$d\varphi = \sum df_{i_1 \dots i_r} \wedge dx_{i_1} \wedge \dots \wedge dx_{i_r};$$

this expression does not depend on the choice of the coordinate system and
defines a homomorphism $d_r\colon \Omega^r \to \Omega^{r+1}$. The relation $d_{r+1} d_r = 0$ is not hard to
verify. Therefore $K = \{\Omega^r, d_r\}$ is a cochain complex; its cohomology $H^r(K)$ is
called the de Rham cohomology of $X$ and denoted by $H^r_{DR}(X)$. By analogy with
the exterior algebra $\bigwedge(E) = \bigoplus \bigwedge^r(E)$ of a vector space $E$ (see §5, Example 12),
we can consider the graded ring $\Omega(X) = \bigoplus \Omega^r$ of all differential forms on $X$. The
operation of taking exterior products extends from $\Omega(X)$ to the group
$H^*_{DR}(X) = \bigoplus H^r_{DR}(X)$, which becomes a graded ring (and a superalgebra).

The connection between Examples 1 and 2 is based on the operation of
integrating a differential form along a chain. More precisely, we can find a
triangulation of a manifold $X$ sufficiently fine that every simplex $\sigma_i$ is contained
in some coordinate neighbourhood, and sufficiently smooth that the homeomorphism
$f_i\colon \bar\sigma_i \to \sigma_i$ between the standard simplex $\bar\sigma_i$ in Euclidean space and $\sigma_i$ is
differentiable as often as we like. Then setting $\int_\sigma \varphi = \sum n_i \int_{\sigma_i} \varphi$ for $\sigma = \sum n_i \sigma_i \in C_r$
and $\varphi \in \Omega^r$ reduces the definition of the integral over a chain $\sigma$ to the definition
of the integral $\int_{\sigma_i} \varphi$ over a single simplex. Now using the diffeomorphism
$f_i\colon \bar\sigma_i \to \sigma_i$ reduces it to integrating the form $f_i^*\varphi$ over the simplex $\bar\sigma_i$ in Euclidean
space, that is, to the computation of an ordinary multiple integral.


Stokes’ Theorem (Generalised Form). For a form $\varphi^{r-1} \in \Omega^{r-1}$ and a chain
$c_r \in C_r$,

$$\int_{\partial c_r} \varphi^{r-1} = \int_{c_r} d\varphi^{r-1}.$$

Since an integral over a chain depends in an additive way on the chain, the
proof of this theorem reduces to the case that $c_r$ is a simplex in Euclidean space.
In this situation, it reduces to the well-known theorems of Green and Stokes for
$r = 1$, 2 or 3, and is proved in the general case in exactly the same way. Introducing
the notation $(c, \varphi) = \int_c \varphi$ for $c \in C_r$ and $\varphi \in \Omega^r$, and extending the pairing
$(c, \varphi)$ to $c \in C_r \otimes \mathbb{R}$, turns Stokes’ theorem into the assertion that the two operators
$\partial$ on $C_r \otimes \mathbb{R}$ and $d$ on $\Omega^{r-1}$ are dual maps, that is, $(\partial c, \varphi) = (c, d\varphi)$. Considering
the pairing $(c, \varphi)$ for $\partial c = d\varphi = 0$, it follows from this that $(c, \varphi)$ vanishes if either
$c = \partial c'$ or $\varphi = d\varphi'$, and hence it induces a pairing between $H_r(X, \mathbb{R})$ and $H^r_{DR}(X)$.


de Rham’s Theorem. The pairing thus constructed is a duality between these
two spaces, so that de Rham cohomology provides an analytic method of computing
the homology of a manifold. An equivalent formulation is that $H^r_{DR}(X)$ is
isomorphic to $H^r(X, \mathbb{R})$.


This isomorphism allows us to transfer the multiplication in the ring $H^*_{DR}(X)$
(see Example 2) to the group $H^*(X, \mathbb{R}) = \bigoplus H^r(X, \mathbb{R})$, which thus becomes a ring,
the cohomology ring of $X$. Of course, there is also a method of defining a
multiplication in $H^*(X, \mathbb{R})$ which does not use the relation with differential forms,
and is not restricted to the case that $X$ is a manifold.

Now we return to the algebraic theory of complexes, restricting ourselves to
cochain complexes; the theory of chain complexes is obtained simply by reversing
the arrows. A subcomplex of $K = \{C^n, d_n\}$ is a complex $K_1 = \{C_1^n, d_n\}$ for which
the groups $C_1^n$ are subgroups of $C^n$, and the differentials between them are
obtained by restricting the differential $d_n$ defined on $C^n$ so that, in particular,
$d_n(C_1^n) \subset C_1^{n+1}$. In this situation, the $d_n$ induce differentials on the groups
$C_2^n = C^n/C_1^n$, and we get a complex $K_2$, called the quotient of $K$ by $K_1$, and denoted
by $K/K_1$.

The cohomology groups of the complexes $K$, $K_1$ and $K_2 = K/K_1$ are connected
by important relations. If we write $\operatorname{Ker} d_n$ for the kernel of $d_n$ in $C^n$, then
by definition we get

$$H^n(K) = \operatorname{Ker} d_n / d_{n-1}(C^{n-1}) \quad\text{and}\quad H^n(K_1) = (\operatorname{Ker} d_n \cap C_1^n)/d_{n-1}(C_1^{n-1}).$$

Since $C_1^{n-1} \subset C^{n-1}$ and $d_{n-1}(C_1^{n-1}) \subset d_{n-1}(C^{n-1})$, sending an element of
$(\operatorname{Ker} d_n \cap C_1^n)/d_{n-1}(C_1^{n-1})$ into its coset modulo the bigger subgroup $d_{n-1}(C^{n-1})$, we get a
homomorphism $i_n\colon H^n(K_1) \to H^n(K)$. Similarly, using the homomorphism
$C^n \to C_2^n = C^n/C_1^n$, we get an equally obvious homomorphism $j_n\colon H^n(K) \to H^n(K_2)$.

There is another homomorphism, which is rather less obvious. Suppose that
$x \in H^n(K_2)$; then $x$ corresponds to an element $\bar y$ of $\operatorname{Ker} d_n$ in $C^n/C_1^n$. Consider an
inverse image $y$ of $\bar y$ in $C^n$. Since $d\bar y = 0$ in $C^{n+1}/C_1^{n+1}$, we have $dy \in C_1^{n+1}$, and
from the definition of a complex, it follows that $dy \in \operatorname{Ker} d_{n+1}$. It is easy to prove
that the coset $dy + d_n C_1^n$ defines an element of $H^{n+1}(K_1)$ depending only on the
original element $x$, and not on the choice of the auxiliary elements $\bar y$ and $y$, and
that this gives a homomorphism $\delta_n\colon H^n(K_2) \to H^{n+1}(K_1)$.

All the homomorphisms we have constructed can be joined together into an
infinite sequence

$$\cdots \xrightarrow{j_{n-1}} H^{n-1}(K_2) \xrightarrow{\delta_{n-1}} H^n(K_1) \xrightarrow{i_n} H^n(K) \xrightarrow{j_n} H^n(K_2) \xrightarrow{\delta_n} H^{n+1}(K_1) \xrightarrow{i_{n+1}} \cdots \tag{1}$$

This sequence has a very important property, which is most simply stated using
the following extremely convenient algebraic notion. A sequence of groups and
homomorphisms $\cdots \to A_{n-1} \xrightarrow{f_{n-1}} A_n \xrightarrow{f_n} A_{n+1} \to \cdots$ is exact if for each $n$ the
image of $f_{n-1}$ coincides with the kernel of $f_n$. This condition is equivalent to saying
that the $\{A_n, f_n\}$ form a cochain complex whose cohomology is equal to 0.
Conversely, the cohomology of any complex measures its failure to be exact. The
exactness of a sequence $0 \to A \xrightarrow{f} B$ means simply that $f$ is an embedding of $A$
into $B$, whereas that of $B \xrightarrow{g} C \to 0$ means that $g$ maps onto the whole of $C$.
Finally, the exactness of $0 \to A \to B \to C \to 0$ is just another way of saying that
$A \subset B$ and $C = B/A$; an exact sequence of this form is a short exact sequence.
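For finite groups the three conditions just described can be tested by enumeration. The sketch below does this for the sequence $0 \to \mathbb{Z}/2 \to \mathbb{Z}/4 \to \mathbb{Z}/2 \to 0$, an example chosen here for illustration; groups are represented naively as ranges of residues, maps as Python functions, and the helper names are hypothetical.

```python
def image(f, domain):
    """Set of values of f on the given finite domain."""
    return {f(x) for x in domain}

def kernel(f, domain):
    """Elements of the domain sent to 0 by f."""
    return {x for x in domain if f(x) == 0}

def is_short_exact(A, f, B, g, C):
    """Check exactness of 0 -> A --f--> B --g--> C -> 0 for finite groups."""
    return (kernel(f, A) == {0}               # f is an embedding
            and image(f, A) == kernel(g, B)   # exactness in the middle
            and image(g, B) == set(C))        # g is onto

# 0 -> Z/2 -> Z/4 -> Z/2 -> 0, with a -> 2a and b -> b mod 2
example_is_exact = is_short_exact(range(2), lambda a: (2 * a) % 4,
                                  range(4), lambda b: b % 2, range(2))
```

Replacing the first map by the zero map breaks the embedding condition, and `is_short_exact` then reports failure.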

We can now state the basic property of the sequence (1) in the following 
compact form: 


Theorem (Long Exact Cohomology Sequence). If

$$0 \to K_1 \to K \to K_2 \to 0$$

is a short exact sequence of cochain complexes (that is, $K_1 \subset K$ and $K_2 = K/K_1$),
then the sequence (1) is exact.


The proof is an almost tautological verification; (1) is called the long exact 
cohomology sequence. 

If $X$ is a triangulated topological space, and $Y$ a closed subspace of $X$ made
up entirely of some of the simplexes of the triangulation of $X$, then with $Y \subset X$
we can associate chain complexes $K_Y$ and $K_X$, such that $K_Y \subset K_X$. Hence we
have a short exact sequence of chain complexes

$$0 \to K_Y \to K_X \to K_X/K_Y \to 0$$

and, as one sees very easily, for any Abelian group $A$, an exact sequence of cochain
complexes

$$0 \to \operatorname{Hom}(K_X/K_Y, A) \to \operatorname{Hom}(K_X, A) \to \operatorname{Hom}(K_Y, A) \to 0.$$

The cohomology of the complex $\operatorname{Hom}(K_X/K_Y, A)$ has an interpretation which
is already a fact of geometry rather than algebra: for $n \ne 0$,

$$H^n(K_X/K_Y, A) = H^n(X/Y, A),$$

where $X/Y$ is the space obtained from $X$ by contracting $Y$ to a point. This
assertion will also be true for $H^0$ if we modify slightly the definition of the
complexes $K$ in dimension $n = 0$ (for each of $X$, $Y$, and $X/Y$), restricting $C_0$ to
be the set of those 0-dimensional chains $\sum n_i \sigma_i^0$ for which $\sum n_i = 0$.

The resulting cohomology groups are denoted $\tilde H^0(X, A)$. The algebraic theorem
on the long exact cohomology sequence then gives us the assertion that the
sequence

$$0 \to \tilde H^0(X/Y, A) \to \tilde H^0(X, A) \to \tilde H^0(Y, A) \to \cdots \to \tilde H^{n-1}(Y, A) \to \tilde H^n(X/Y, A) \to \tilde H^n(X, A) \to \tilde H^n(Y, A) \to \cdots \tag{2}$$

is exact, where $\tilde H^n(X, A) = H^n(X, A)$ for $n > 0$. From this it follows, for example,
that if all $\tilde H^n(X, A) = 0$ then the groups $\tilde H^n(Y, A)$ and $\tilde H^{n+1}(X/Y, A)$ are
isomorphic.

This result can be viewed in the following way: $\tilde H^0(X, A)$ defines a functor from
the category $\mathcal{H}ot$ to the category of Abelian groups. For spaces $X$, $Y$ and $X/Y$
this functor is related by an exact sequence

$$0 \to \tilde H^0(X/Y, A) \to \tilde H^0(X, A) \to \tilde H^0(Y, A),$$

which however is not exact if we add $\to 0$ onto the right-hand end. But it can be
extended to an exact sequence (2) by introducing an infinite number of new
functors $\tilde H^n(X, A)$. This situation occurs frequently, and the particular case we
have considered suggests the general principle: if for some important functor $F(A)$
with values in the category of Abelian groups short exact sequences arise in
natural situations, for example of the form $0 \to F(B) \to F(C) \to F(A)$, then we
should wonder whether it is not possible to define a family of functors $F^n(A)$ such
that $F^0 = F$, and which are related by a long exact sequence of type (1) or (2).
This is a completely new method of constructing functors. In the remainder of
this section we will give two illustrations of this rather flexible principle. A third
realisation of it is the subject matter of the following §22.


B. Cohomology of Modules and Groups 


We have already seen in §20, Example 11 that for a fixed module $A$ over a
ring $R$ and a ‘variable’ module $M$, $F(M) = \operatorname{Hom}_R(M, A)$ is a contravariant
functor from the category of $R$-modules to that of Abelian groups. Therefore an
exact sequence of modules

$$0 \to L \xrightarrow{f} M \xrightarrow{g} N \to 0 \tag{3}$$

defines homomorphisms $F(g)\colon \operatorname{Hom}_R(N, A) \to \operatorname{Hom}_R(M, A)$ and
$F(f)\colon \operatorname{Hom}_R(M, A) \to \operatorname{Hom}_R(L, A)$. It is easy to check that the sequence

$$0 \to \operatorname{Hom}_R(N, A) \xrightarrow{F(g)} \operatorname{Hom}_R(M, A) \xrightarrow{F(f)} \operatorname{Hom}_R(L, A) \tag{4}$$

is exact, but will not remain exact if we add $\to 0$ onto the right-hand end. This
means that the homomorphism $F(f)$ does not have to be onto: this can be seen
in the example of the exact sequence $0 \to p\mathbb{Z}/p^2\mathbb{Z} \to \mathbb{Z}/p^2\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z} \to 0$ of
$\mathbb{Z}$-modules when $A = \mathbb{Z}/p\mathbb{Z}$.
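The failure of surjectivity in this example can be seen by listing homomorphisms explicitly. The sketch below relies on the standard fact that a homomorphism out of a cyclic group $\mathbb{Z}/m$ is determined by the image $t$ of the generator, subject to $mt = 0$ in the target; the function names are illustrative and not taken from the text.

```python
def homs(m, n):
    """All homomorphisms Z/m -> Z/n, each recorded as the image t of 1.

    The only constraint on t is m * t = 0 in Z/n."""
    return [t for t in range(n) if (m * t) % n == 0]

def restriction_image(p):
    """Image of F(f): Hom(Z/p^2, Z/p) -> Hom(pZ/p^2Z, Z/p).

    pZ/p^2Z is cyclic of order p, generated by the class of p, so a
    restricted homomorphism is recorded by its value on p, namely t * p."""
    return {(t * p) % p for t in homs(p * p, p)}

def all_homs_on_subgroup(p):
    """Hom(pZ/p^2Z, Z/p), recorded by the image of the generator."""
    return set(homs(p, p))
```

Every homomorphism $\mathbb{Z}/p^2\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$ kills the subgroup $p\mathbb{Z}/p^2\mathbb{Z}$, so `restriction_image(p)` contains only the zero map, whereas `all_homs_on_subgroup(p)` has $p$ elements.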

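However, the sequence (4) can be extended preserving exactness. This relates
to the groups $\operatorname{Ext}_R(A, B)$ introduced in §12, Example 2. One can show that, in a
similar way to $\operatorname{Hom}_R(A, B)$, the group $\operatorname{Ext}_R(A, B)$ defines for fixed $A$ a covariant
functor $G(B) = \operatorname{Ext}_R(A, B)$ from the category $\mathcal{M}od_R$ into $\mathcal{A}b$, and for fixed $B$ a
contravariant functor $E(A) = \operatorname{Ext}_R(A, B)$. In view of the exact sequence (3), the
module $M$ can be thought of as an element of $\operatorname{Ext}_R(N, L)$, and any homomorphism
$\varphi \in \operatorname{Hom}_R(L, A)$ defines a homomorphism $G(\varphi)\colon \operatorname{Ext}_R(N, L) \to \operatorname{Ext}_R(N, A)$.
In particular, $G(\varphi)(M) \in \operatorname{Ext}_R(N, A)$, and as a function of $\varphi$ it defines a homomorphism
$\partial\colon \operatorname{Hom}_R(L, A) \to \operatorname{Ext}_R(N, A)$. One can prove that the sequence

$$0 \to \operatorname{Hom}_R(N, A) \xrightarrow{F(g)} \operatorname{Hom}_R(M, A) \xrightarrow{F(f)} \operatorname{Hom}_R(L, A) \xrightarrow{\partial} \operatorname{Ext}_R(N, A)$$

is exact. But we can include this into the even longer sequence

$$0 \to \operatorname{Hom}_R(N, A) \xrightarrow{F(g)} \operatorname{Hom}_R(M, A) \xrightarrow{F(f)} \operatorname{Hom}_R(L, A) \xrightarrow{\partial} \operatorname{Ext}_R(N, A) \xrightarrow{E(g)} \operatorname{Ext}_R(M, A) \xrightarrow{E(f)} \operatorname{Ext}_R(L, A) \tag{5}$$

(where $E(g)$ and $E(f)$ are the homomorphisms defined by the functor $E(M) =
\operatorname{Ext}_R(M, A)$), and the given sequence will also be exact! This of course sustains
our hope of extending this to an infinite exact sequence.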

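We will in fact construct a system of Abelian groups, denoted by $\operatorname{Ext}^n_R(A, B)$;
for fixed $n$ and fixed argument $B$ these are contravariant functors of the first
argument, and for any exact sequence $0 \to L \to M \to N \to 0$ of modules they are
connected by the exact sequence (for all $n > 0$)

$$\cdots \to \operatorname{Ext}^{n-1}_R(L, A) \to \operatorname{Ext}^n_R(N, A) \to \operatorname{Ext}^n_R(M, A) \to \operatorname{Ext}^n_R(L, A) \to \operatorname{Ext}^{n+1}_R(N, A) \to \cdots \tag{6}$$

Here $\operatorname{Ext}^0_R$ is just $\operatorname{Hom}_R$ and $\operatorname{Ext}^1_R$ is $\operatorname{Ext}_R$.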

The idea of the construction of such a system of functors is very simple.
Suppose that our problem is already solved, and that in addition we know some
type of modules (which we will denote by $P$) for which these functors vanish, that
is,

$$\operatorname{Ext}^n_R(P, A) = 0 \quad\text{for all modules } A \text{ and all } n \ge 1.$$

Suppose, in addition, that we have managed to fit a module $N$ into an exact
sequence $0 \to L \to P \to N \to 0$ with $P$ a module of this type, representing $N$ as a
homomorphic image of $P$. Then from the exact sequence (6) it will follow that
the group $\operatorname{Ext}^n_R(N, A)$ is isomorphic to $\operatorname{Ext}^{n-1}_R(L, A)$, and we obtain an inductive
definition of our functors.

The problem then is to find the modules $P$ which are to annihilate the as yet
unknown functors $\operatorname{Ext}^n_R$. But a part of these functors is known, namely $\operatorname{Ext}^1_R =
\operatorname{Ext}_R$, and we must start by considering modules which annihilate this. We say
that a module $P$ such that $\operatorname{Ext}_R(P, A) = 0$ for every module $A$ is projective. Stated
more simply, this says that if $P$ is represented as a homomorphic image of a
module $A$, then there exists a submodule $B \subset A$ such that $A = P \oplus B$ (and the
first projection $A \to P$ is just the given homomorphism $A \to P$). Thus when we
are dealing with projective modules, we have as it were got back to a semisimple
situation. The simplest example of a projective module is a free module. In fact,
let $F$ be a free module over a ring $R$ and $x_1, \dots, x_n$ a system of free generators
of $F$ (which we assume finite only to simplify notation). If $0 \to L \to M \xrightarrow{g} F \to 0$
is an exact sequence then $g$ maps onto $F$, so that there exist elements $y_i \in M$ such
that $g(y_i) = x_i$. Let $M' = Ry_1 + \dots + Ry_n$ be the submodule of $M$ which they
generate. From the fact that $\{x_i = g(y_i)\}$ is a free system of generators, it follows
that the same holds for $\{y_i\}$. From this it follows easily that $g$ maps $M'$
isomorphically to $F$, and $M = L \oplus M'$, where $L = \operatorname{Ker} g$.

It turns out that the class of projective modules is already sufficient to carry
out our program. Any module is a homomorphic image of a free module, and
hence a fortiori of a projective module. Let

$$0 \to L \to P \to N \to 0 \tag{7}$$

be a representation of $N$ as a homomorphic image of a projective module $P$.
Suppose that the functors $\operatorname{Ext}^r_R(L, A)$ are already defined for $r \le n - 1$, and
set $\operatorname{Ext}^n_R(N, A) = \operatorname{Ext}^{n-1}_R(L, A)$. We omit the definition of the homomorphism
$\operatorname{Ext}^n_R(\varphi)\colon \operatorname{Ext}^n_R(L', A) \to \operatorname{Ext}^n_R(L, A)$ corresponding to a homomorphism $\varphi\colon L \to L'$,
which is not difficult. It can be proved that both the groups $\operatorname{Ext}^n_R(N, A)$ and the
homomorphisms $\operatorname{Ext}^n_R(\varphi)$ do not depend on the choice of the sequence (7) for $N$,
that is, they are well defined. They form a contravariant functor (for fixed $n$ and
$A$), and are related by the exact sequence (6).

Putting together the $n$ steps, we can obtain a definition of the functors $\operatorname{Ext}^n_R$
that is not inductive. For this, we represent the module $L$ in (7) in a similar form,
that is, we fit it into an exact sequence $0 \to L' \to P' \to L \to 0$ with $P'$ projective,
and then do the same with $L'$, and so on. We obtain an infinite exact sequence

$$\cdots \to P_n \xrightarrow{\varphi_n} P_{n-1} \to \cdots \to P_1 \xrightarrow{\varphi_1} P_0 \xrightarrow{\varphi_0} N \to 0 \tag{8}$$

with the $P_n$ projective modules, called a projective resolution of $N$. Applying the
functor $\operatorname{Hom}_R(\,\cdot\,, A)$ to this, and leaving off the first term, we get a sequence

$$\operatorname{Hom}_R(P_0, A) \xrightarrow{\psi_0} \operatorname{Hom}_R(P_1, A) \xrightarrow{\psi_1} \cdots \to \operatorname{Hom}_R(P_n, A) \xrightarrow{\psi_n} \cdots \tag{9}$$

which will not necessarily be exact, but which will be a cochain complex (from
the fact that $\varphi_n \circ \varphi_{n+1} = 0$ in (8), it follows by the definition of a functor
that $\psi_{n+1} \circ \psi_n = 0$ in (9)). The cohomology of the complex (9) coincides with the
groups $\operatorname{Ext}^n_R(N, A)$.
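As a small worked instance of this recipe, take $R = \mathbb{Z}$, $N = \mathbb{Z}/m$ and $A = \mathbb{Z}/k$ (a choice made here for illustration). The free resolution $0 \to \mathbb{Z} \xrightarrow{m} \mathbb{Z} \to \mathbb{Z}/m \to 0$ has $P_0 = P_1 = \mathbb{Z}$, and since $\operatorname{Hom}(\mathbb{Z}, A) \cong A$ the complex (9) becomes $A \xrightarrow{m} A \to 0$. The sketch below, with the hypothetical name `ext_orders_cyclic`, simply counts the kernel and cokernel of multiplication by $m$.

```python
from math import gcd

def ext_orders_cyclic(m, k):
    """Orders of Ext^0 and Ext^1 of (Z/m, Z/k) over Z.

    Uses the resolution 0 -> Z --m--> Z -> Z/m -> 0, so that complex (9)
    is Z/k --(multiplication by m)--> Z/k -> 0."""
    A = range(k)
    kernel = [a for a in A if (m * a) % k == 0]   # H^0 of the complex
    image = {(m * a) % k for a in A}
    ext0 = len(kernel)
    ext1 = k // len(image)                        # order of the cokernel A / mA
    return ext0, ext1
```

Both orders come out equal to $\gcd(m, k)$, in agreement with $\operatorname{Ext}^0 = \operatorname{Hom}(\mathbb{Z}/m, \mathbb{Z}/k)$ being cyclic of that order.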


Example 3. Group Cohomology. The most important case in which the groups
$\operatorname{Ext}^n_R(N, A)$ have many algebraic and general mathematical applications is when
$R = \mathbb{Z}[G]$ is the integral group ring of a group $G$, so that the category of
$R$-modules is just the category of $G$-modules. For a $G$-module $A$, the group
$\operatorname{Ext}^n_R(\mathbb{Z}, A)$ (where $\mathbb{Z}$ is considered as a module with a trivial $G$-action) is called
the $n$th cohomology group of $G$ with coefficients in $A$, and denoted by $H^n(G, A)$.

To construct a projective resolution of $\mathbb{Z}$ (or even one just consisting of free
modules) is a technical problem that is not particularly difficult. As a result we
obtain a completely explicit form of the complexes (8) and (9). Let us write out
the second of these. It has groups $C^n$ ($= \operatorname{Hom}_G(P_n, A)$) consisting of arbitrary
functions $f(g_1, \dots, g_n)$ of $n$ elements of $G$ with values in $A$. The differential
$d_n\colon C^n \to C^{n+1}$ is defined as follows:

$$(df)(g_1, \dots, g_{n+1}) = g_1 f(g_2, \dots, g_{n+1}) + \sum_{i=1}^{n} (-1)^i f(g_1, \dots, g_i g_{i+1}, \dots, g_{n+1}) + (-1)^{n+1} f(g_1, \dots, g_n)$$

(to understand the first term, you need to recall that $f(g_2, \dots, g_{n+1}) \in A$ and that
$A$ is a $G$-module, which explains the meaning of $g_1 f(g_2, \dots, g_{n+1})$).
We write out the first few cases:

$n = 0$: $f = a \in A$ and $(df)(g) = ga - a$;

$n = 1$: $f(g) \in A$ and $(df)(g_1, g_2) = g_1 f(g_2) - f(g_1 g_2) + f(g_1)$;

$n = 2$: $f(g_1, g_2) \in A$ and $(df)(g_1, g_2, g_3) = g_1 f(g_2, g_3) + f(g_1, g_2 g_3) - f(g_1 g_2, g_3) - f(g_1, g_2)$.

Thus $H^0(G, A)$ is the set of all elements $a \in A$ such that $ga - a = 0$ for all $g \in G$,
that is, the set of $G$-invariant elements of $A$.

$H^1(G, A)$ is the group of functions $f(g)$ of $g \in G$ with values in $A$ satisfying the
condition

$$f(g_1 g_2) = f(g_1) + g_1 f(g_2), \tag{10}$$

modulo the group of functions of the form

$$f(g) = ga - a \quad\text{with } a \in A. \tag{11}$$

If the action of $G$ on $A$ is trivial then $H^1(G, A) = \operatorname{Hom}(G, A)$.

$H^2(G, A)$ is the group of functions $f(g_1, g_2)$ of $g_1, g_2 \in G$ with values in $A$
satisfying the condition

$$f(g_1, g_2 g_3) + g_1 f(g_2, g_3) = f(g_1 g_2, g_3) + f(g_1, g_2), \tag{12}$$

modulo the group of functions of the form

$$f(g_1, g_2) = h(g_1 g_2) - h(g_1) - g_1 h(g_2), \tag{13}$$

with $h(g)$ any function of $g \in G$ with values in $A$.
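The formula for the differential on functions of $n$ group elements can be tried out on a small case. The sketch below takes $G = \mathbb{Z}/3$ acting on $A = \mathbb{Z}/7$ with the generator acting as multiplication by 2 (so that $2^3 = 1$ in $\mathbb{Z}/7$); this group, this module, and the names `act`, `d` and `random_cochain` are all assumptions made for the illustration. A cochain is stored as a dictionary from $n$-tuples of elements of $G$ to elements of $A$.

```python
import random
from itertools import product

N, M, U = 3, 7, 2        # G = Z/N, A = Z/M, generator of G acts as multiplication by U
G = range(N)

def act(g, a):
    """Action of g in G on a in A."""
    return (pow(U, g, M) * a) % M

def d(f, n):
    """Differential C^n -> C^{n+1} of an n-cochain f (dict: n-tuples -> A)."""
    df = {}
    for gs in product(G, repeat=n + 1):
        total = act(gs[0], f[gs[1:]])                       # g_1 f(g_2, ..., g_{n+1})
        for i in range(1, n + 1):                           # terms with g_i g_{i+1} merged
            merged = gs[:i - 1] + ((gs[i - 1] + gs[i]) % N,) + gs[i + 1:]
            total += (-1) ** i * f[merged]
        total += (-1) ** (n + 1) * f[gs[:n]]                # last term f(g_1, ..., g_n)
        df[gs] = total % M
    return df

def random_cochain(n):
    """An arbitrary function of n elements of G with values in A."""
    return {gs: random.randrange(M) for gs in product(G, repeat=n)}
```

Applying `d` twice to any cochain gives the zero function, which is the statement that these groups and differentials form a cochain complex; for $n = 0$ the output of `d` is the function $g \mapsto ga - a$ of (11).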
We give a number of examples of situations in which these groups arise. 


Example 4. Group Extensions. Suppose that a group $\Gamma$ has a normal subgroup
$N$ with quotient $\Gamma/N = G$. How can we reconstruct $\Gamma$ from $G$ and $N$? We have
already met this problem in §16; recall that we say that $\Gamma$ is an extension of $G$
by $N$. We now treat this in more detail in the case that $N$ is an Abelian group,
and in this case we denote it by $A$ rather than $N$.

For any $\gamma \in \Gamma$ and $a \in A$, the element $\gamma a \gamma^{-1}$ is also contained in $A$. Furthermore,
the map $a \mapsto \gamma a \gamma^{-1}$ is an automorphism of $A$. Since $A$ is commutative, $\gamma a \gamma^{-1}$ is
not affected by changing $\gamma$ in its coset mod $A$; hence $\gamma a \gamma^{-1}$ depends only on the
coset $g \in G = \Gamma/A$ containing $\gamma$. We therefore denote it by $g(a)$. Thus $G$ acts on
$A$, and $A$ becomes a $G$-module (but with the group operation in $A$ written
multiplicatively, as in $\Gamma$, rather than additively).

We choose any-old-how a representative of each coset $g \in G = \Gamma/A$, and
denote it by $s(g) \in \Gamma$. In general $s(g_1)s(g_2) \ne s(g_1 g_2)$, but these two elements
belong to the same coset of $A$; hence there exist elements $f(g_1, g_2) \in A$ such that

$$s(g_1)s(g_2) = f(g_1, g_2)\, s(g_1 g_2). \tag{14}$$


The elements $f(g_1, g_2)$ cannot be chosen at all arbitrarily. Writing out the
associative law $(s(g_1)s(g_2))s(g_3) = s(g_1)(s(g_2)s(g_3))$ in terms of (14) we see that
they satisfy the condition

$$f(g_1, g_2 g_3)\, g_1(f(g_2, g_3)) = f(g_1 g_2, g_3)\, f(g_1, g_2), \tag{15}$$

which is the relation (12) written multiplicatively. It is easy to check that the
structure of $A$ as a $G$-module and the collection of elements $f(g_1, g_2) \in A$ for $g_1$,
$g_2 \in G$ satisfying (15) already define an extension $\Gamma$ of $G$ by $A$. However, there
was an ambiguity in our construction, namely the arbitrary choice of the coset
representatives $s(g)$. Any other choice is of the form $s'(g) = h(g)s(g)$ with $h(g) \in A$.
It is easy to check that using these representatives, we get a new system of
elements $f'(g_1, g_2)$, related to the old ones by

$$f'(g_1, g_2) = f(g_1, g_2)\, h(g_1)\, g_1(h(g_2))\, h(g_1 g_2)^{-1}.$$

Thus taking account of (13), we can say that an extension of $G$ by $A$ is uniquely
determined by the structure of $A$ as a $G$-module and an element of $H^2(G, A)$.
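The smallest nontrivial instance of this construction can be worked through by hand or by machine. In the sketch below, which is an illustration chosen here and not an example from the text, $\Gamma = \mathbb{Z}/4$ is written additively, $A = \{0, 2\}$, $G = \Gamma/A \cong \mathbb{Z}/2$ acts trivially on $A$, and the coset representatives are $s(0) = 0$, $s(1) = 1$. The code computes the elements $f(g_1, g_2)$ of (14), checks condition (12), and lists all functions of the form (13).

```python
from itertools import product

G = (0, 1)             # G = (Z/4)/A, written additively mod 2
A = (0, 2)             # the subgroup A = {0, 2} of Z/4
s = {0: 0, 1: 1}       # chosen coset representatives in Z/4

def f(g1, g2):
    """Relation (14) written additively: s(g1) + s(g2) = f(g1, g2) + s(g1 + g2)."""
    return (s[g1] + s[g2] - s[(g1 + g2) % 2]) % 4

def is_cocycle(c):
    """Condition (12), with G acting trivially on A."""
    return all(
        (c(g1, (g2 + g3) % 2) + c(g2, g3)) % 4
        == (c((g1 + g2) % 2, g3) + c(g1, g2)) % 4
        for g1, g2, g3 in product(G, repeat=3)
    )

def coboundaries():
    """All functions of the form (13), as dicts on G x G."""
    result = []
    for h0, h1 in product(A, repeat=2):
        h = {0: h0, 1: h1}
        result.append({(g1, g2): (h[(g1 + g2) % 2] - h[g1] - h[g2]) % 4
                       for g1, g2 in product(G, repeat=2)})
    return result

factor_set = {(g1, g2): f(g1, g2) for g1, g2 in product(G, repeat=2)}
```

Here `factor_set` takes values in $A$, satisfies (12), and does not appear among the coboundaries, so it represents a nonzero element of $H^2(G, A)$; this matches the fact that $\mathbb{Z}/4$ is not the direct product of two groups of order 2.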
Let us stay for a moment with the case that the element of $H^2(G, A)$ corresponding
to an extension is zero. By (14) this means that we can choose coset
representatives $s(g)$ for $\Gamma/A$ in such a way that $s(g_1 g_2) = s(g_1)s(g_2)$. In other
words, these coset representatives themselves form a group $G'$ isomorphic to $G$,
and any element $\gamma \in \Gamma$ can be uniquely written in the form $\gamma = ag'$ with $a \in A$
and $g' \in G'$. In this case we say that the extension is split, that $\Gamma$ is a semidirect
product of $A$ and $G$, and that $G'$ is a complement of $A$ in $\Gamma$. For example, the group
of motions of the plane is a semidirect product of the group of translations and
a group of rotations, and as a complement of the group of translations we can
choose the group of rotations around some fixed point. Now for a split extension,
how uniquely is a complement determined in general? The point is that we can
change the coset representatives by $s'(g) = f(g)s(g)$ with $f(g) \in A$; it is easy to see
that a necessary condition for these to form a subgroup is the following relation:

$$f(g_1 g_2) = f(g_1)\, g_1(f(g_2)),$$

which is (10) written multiplicatively. But there exists a ‘trivial’ way of getting
from one complement to another, namely conjugacy by an element $a \in A$, which
takes $G'$ into $G'' = aG'a^{-1}$. It is easy to see that $f(g)$ goes under this into
$f(g)\, a\, g(a)^{-1}$. In view of (11) we deduce that in a semidirect product of $A$ and $G$,
complements of $A$, up to conjugacy by elements of $A$, are described by $H^1(G, A)$.


Example 5. The Cohomology of Discrete Groups. Suppose that $X$ is a topological
space having the homotopy type of a point, and that a group $\Gamma$ acts on $X$
discretely and freely (see §14). In this case we can construct a triangulation of $X$
such that $\Gamma$ acts freely on it. Then the group $C_n$ of $n$-chains of $X$ will be a free
$\Gamma$-module, and the chain complex $\{C_n, \partial_n\}$ will be a projective resolution of $\mathbb{Z}$ (in
fact made up of free $\Gamma$-modules). Now simply putting together definitions shows
that for any Abelian group $A$ considered as a $\Gamma$-module with trivial action, the
cohomology groups $H^n(\Gamma, A)$ have a geometric realisation; they are isomorphic
to the cohomology groups $H^n(\Gamma\backslash X, A)$ of the quotient space $\Gamma\backslash X$ (see Example
1). Any group $\Gamma$ can be realised as a transformation group satisfying the above
conditions. In this way we get a geometric interpretation of the groups $H^n(\Gamma, A)$.

This situation is realised in particular if $X = G/K$ where $G$ is a connected Lie
group, and $K$ a maximal compact subgroup of $G$, since then $X$ is homeomorphic
to Euclidean space. Let $\Gamma \subset G$ be a discrete group without elements of finite
order. Then $\Gamma$ acts freely on the coset space $G/K$ by left translations, and we have
seen that $H^n(\Gamma, A) = H^n(\Gamma\backslash G/K, A)$. In particular, since $\Gamma\backslash G/K$ is a
finite-dimensional space, $H^n(\Gamma, A) = 0$ for all $n$ from some point on. For the groups
$H^n(\Gamma, \mathbb{Z})$ we can introduce the notion of the Euler characteristic

$$\chi(\Gamma, \mathbb{Z}) = \sum (-1)^n \operatorname{rank} H^n(\Gamma, \mathbb{Z}).$$

We see that $\chi(\Gamma, \mathbb{Z}) = \chi(\Gamma\backslash G/K)$, where the right-hand side is the topological
Euler characteristic, defined for a space $X$ by $\chi(X) = \sum (-1)^q \dim_{\mathbb{R}} H^q(X, \mathbb{R})$.

This is applicable in particular to the case when $G$ is an algebraic group over
$\mathbb{Q}$, and $\Gamma$ is an arithmetic subgroup $\Gamma \subset G(\mathbb{Z})$ of finite index in $G(\mathbb{Z})$ (see §15.C).
In this case $\chi(\Gamma, \mathbb{Z})$ often has a delicate arithmetic meaning; for example, it may
be expressed in terms of values of the Riemann $\zeta$-function at integers. Thus for
any subgroup $\Gamma \subset \mathrm{SL}(2, \mathbb{Z})$ of finite index and without any elements of finite
order,

$$\chi(\Gamma, \mathbb{Z}) = (\mathrm{SL}(2, \mathbb{Z}) : \Gamma) \cdot \zeta(-1) = -\frac{(\mathrm{SL}(2, \mathbb{Z}) : \Gamma)}{12}.$$




Finally, we mention yet another very important application of group cohomology.
Let $K$ be a field and $L/K$ a Galois extension of $K$ with group $G$ (see §18.A).
The group $L^*$ of nonzero elements of $L$ under multiplication is a $G$-module, and
the cohomology groups $H^n(G, L^*)$ have very many applications both in algebraic
questions and in arithmetic (when $K$ is an algebraic number field).


C. Sheaf Cohomology 


Let $X$ be a topological space and $\mathcal{C}$ the category whose objects are the open
subsets of $X$, and morphisms inclusions between them (§20, Example 2). A
contravariant functor from $\mathcal{C}$ to the category of Abelian groups is called a
presheaf of Abelian groups on $X$. Thus a presheaf $\mathcal{F}$ assigns to each open subset
$U \subset X$ an Abelian group $\mathcal{F}(U)$, and to any two open sets $V \subset U$ a homomorphism
$\rho^U_V\colon \mathcal{F}(U) \to \mathcal{F}(V)$ such that $\rho^U_U = 1_{\mathcal{F}(U)}$ and if $W \subset V \subset U$ then

$$\rho^U_W = \rho^V_W\, \rho^U_V.$$


Example 6. The Presheaf $\mathcal{O}_{\mathcal{C}}$ of Continuous Functions on $X$. By definition,
$\mathcal{O}_{\mathcal{C}}(U)$ is the set of all continuous functions on $U$, and $\rho^U_V$ is the restriction of
functions from $U$ to $V$. In view of this example, the $\rho^U_V$ are also called restriction
homomorphisms in the general case.

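A presheaf $\mathcal{F}$ is called a sheaf if for any open set $U \subset X$ and for any representation
$U = \bigcup U_\alpha$ of it as a union of open sets the following two conditions hold:

1. For $s \in \mathcal{F}(U)$, if $\rho^U_{U_\alpha} s = 0$ for each $U_\alpha$ then $s = 0$.

2. If $s_\alpha \in \mathcal{F}(U_\alpha)$ are such that $\rho^{U_\alpha}_{U_\alpha \cap U_\beta} s_\alpha = \rho^{U_\beta}_{U_\alpha \cap U_\beta} s_\beta$ for all $\alpha$ and $\beta$ then there
exists $s \in \mathcal{F}(U)$ such that $s_\alpha = \rho^U_{U_\alpha} s$ for each $\alpha$.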
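The two conditions can be made concrete for the simplest kind of sheaf, that of all functions on the open sets of a space. In the sketch below, which is only an illustration under assumptions made here, a section over $U$ is a Python dictionary whose keys are the points of $U$; `restrict` plays the role of the restriction homomorphism and `glue` carries out the construction required by condition 2. Both names are hypothetical.

```python
def restrict(section, V):
    """Restriction map: keep only the values at the points of V."""
    return {x: section[x] for x in V}

def glue(sections):
    """Glue sections s_alpha given on sets U_alpha (the key sets of the dicts).

    Returns the section on the union of the U_alpha that restricts to each
    s_alpha, or None if two of them disagree on an overlap."""
    glued = {}
    for s in sections:
        for x, value in s.items():
            if glued.setdefault(x, value) != value:
                return None      # the compatibility hypothesis of condition 2 fails
    return glued
```

Condition 1 holds here because a function that restricts to zero on every set of a cover is zero at every point. If sections were required to be constant functions, `glue` could return a non-constant function for a cover by two disjoint sets, which is one way to see that constant functions form a presheaf that is in general not a sheaf.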

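Sheaves over a given space themselves form a category. A homomorphism of a
sheaf $\mathcal{F}$ into a sheaf $\mathcal{G}$ is a system of homomorphisms $f_U\colon \mathcal{F}(U) \to \mathcal{G}(U)$ for all
open sets $U \subset X$, such that for all inclusions $V \subset U$ the diagram

$$\begin{array}{ccc}
\mathcal{F}(U) & \xrightarrow{\ f_U\ } & \mathcal{G}(U) \\
{\scriptstyle \rho^U_V}\downarrow & & \downarrow{\scriptstyle \bar\rho^U_V} \\
\mathcal{F}(V) & \xrightarrow{\ f_V\ } & \mathcal{G}(V)
\end{array}$$

is commutative, where $\rho^U_V$ and $\bar\rho^U_V$ are the restriction maps for $\mathcal{F}$ and $\mathcal{G}$. We say
that $\mathcal{F}$ is a subsheaf of $\mathcal{G}$ if $\mathcal{F}(U)$ is a subgroup of $\mathcal{G}(U)$ for every open set $U \subset X$.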

The definition of a sheaf indicates that the groups $\mathcal{F}(U)$ are specified by local
conditions. For example, the presheaf of continuous functions $\mathcal{O}_{\mathcal{C}}$ is a sheaf, and
if $X$ is a differentiable manifold then the sheaf $\mathcal{O}_{\mathrm{diff}} \subset \mathcal{O}_{\mathcal{C}}$ of differentiable
functions is the presheaf for which $\mathcal{O}_{\mathrm{diff}}(U)$ consists of all differentiable functions on
$U$ (that is, having derivatives up to a given order $n \le \infty$). Similarly, if $X$ is a
complex analytic manifold, then the sheaf $\mathcal{O}_{\mathrm{an}}$ of analytic functions is the presheaf
for which $\mathcal{O}_{\mathrm{an}}(U)$ consists of all analytic functions on $U$. All of these examples
motivate a general point of view, which unifies the definitions of various types
of spaces. The definition should consist of a topological space $X$, a subsheaf $\mathcal{O}$
of the sheaf of continuous functions on $X$, and some supply $\mathcal{M}$ of models (such
as the cube in $\mathbb{R}^n$ with the sheaf of differentiable functions on it, or the polydisc
in $\mathbb{C}^n$ with the sheaf of analytic functions), with the requirement that each point
$x \in X$ should have a neighbourhood $U$ such that $U$ together with the restriction
to it of the sheaf $\mathcal{O}$ should be isomorphic to one of the models. This conception
allows us to find natural definitions of objects which would otherwise not be easy
to formulate; for example, complex analytic varieties with singularities (complex
spaces). With a suitable modification this leads also to the natural definition of
algebraic varieties over an arbitrary field, and their far-reaching generalisations,
schemes.

Other examples of sheaves: the sheaf $\Omega^r$ of differential forms on a differentiable
manifold $X$ (here $\Omega^r(U)$ is the set of all differential $r$-forms on $U$), or the sheaf
of vector fields.


Example 7. Let $X$ be a Riemann surface (a 1-dimensional complex analytic
manifold), $x_1, \dots, x_k \in X$ any set of points of $X$ and $n_1, \dots, n_k$ any set of positive
integers. The formal combination $D = n_1 x_1 + \dots + n_k x_k$ is called a divisor (the
condition $n_i > 0$ is not usually assumed). The sheaf $\mathcal{F}_D$ corresponding to $D$ is
defined as follows: $\mathcal{F}_D(U)$ is the set of meromorphic functions on $U$ having poles
only at points $x_1, \dots, x_k$ in $U$, such that the order of the pole at $x_i$ is at most $n_i$.

If $f\colon \mathcal{F} \to \mathcal{G}$ is a sheaf homomorphism then $\mathcal{K}(U) = \operatorname{Ker} f_U$ defines a subsheaf
of $\mathcal{F}$, called the kernel of $f$. Defining the image in the same way would not be
good: the presheaf $\mathcal{I}'(U) = \operatorname{Im} f_U$ is not a sheaf in general. But it can be embedded
into a minimal subsheaf $\mathcal{I}$ of $\mathcal{G}$: $\mathcal{I}(U)$ consists of the elements $s \in \mathcal{G}(U)$ such that
for each point $x \in U$ there exists a neighbourhood $U_x$ in which $\rho^U_{U_x} s \in \operatorname{Im} f_{U_x}$.
This sheaf $\mathcal{I}$ is called the image of $f$. Now that we have notions of kernel and
image, we can define exact sequences of sheaves, repeating word-for-word the
definition given for modules. For each sheaf $\mathcal{F}$ and its subsheaf $\mathcal{G}$ we can
construct a sheaf $\mathcal{H}$ for which the sequence $0 \to \mathcal{G} \to \mathcal{F} \to \mathcal{H} \to 0$ is exact; $\mathcal{H}$ is
called the quotient sheaf, $\mathcal{H} = \mathcal{F}/\mathcal{G}$.

The most important invariant of a sheaf $\mathcal{F}$ on a space $X$ is the group $\mathcal{F}(X)$.
Usually, this is the set of all the global objects defined by local conditions. For
example, for the sheaf of differential forms, it is the group of differential forms
on the whole manifold, for the sheaf of vector fields, the group of global vector
fields. In Example 7, it is the group of all functions which are meromorphic
everywhere on a Riemann surface $X$, with poles only at $x_1, \dots, x_k$, and of orders
at most $n_1, \dots, n_k$. From the definition of homomorphisms of sheaves it follows
that taking a sheaf $\mathcal{F}$ into $\mathcal{F}(X)$ is a covariant functor from the category of
sheaves to the category of Abelian groups. Let

$$0 \to \mathcal{G} \to \mathcal{F} \to \mathcal{H} \to 0$$

be an exact sequence of sheaves. It is easy to check that the sequence of groups

$$0 \to \mathcal{G}(X) \to \mathcal{F}(X) \to \mathcal{H}(X)$$

is exact, but does not remain exact if we add $\to 0$ onto the right-hand end. Here
is an example of this phenomenon. Let $X$ be the Riemann sphere, $\mathcal{O}_{\mathrm{an}}$ the sheaf
of analytic functions on $X$. We define a sheaf $\mathcal{H}$ by taking a finite set of points
$\Phi = \{x_1, \dots, x_k\} \subset X$, assigning to each point $x_i$ one copy $\mathbb{C}_i$ of the group of
complex numbers, and setting $\mathcal{H}(U) = \bigoplus \mathbb{C}_i$, where the sum is taken over
$x_i \in \Phi \cap U$. Taking a function $f \in \mathcal{O}_{\mathrm{an}}(U)$ into its set of values $\{f(x_i)\} \in \bigoplus \mathbb{C}_i$ at
points $x_i \in \Phi \cap U$ defines a homomorphism $\mathcal{O}_{\mathrm{an}} \to \mathcal{H}$ with image $\mathcal{H}$. If $\mathcal{G}$ is the
kernel then we get an exact sequence $0 \to \mathcal{G} \to \mathcal{O}_{\mathrm{an}} \to \mathcal{H} \to 0$. In it, $\mathcal{O}_{\mathrm{an}}(X) = \mathbb{C}$ by
Liouville’s theorem, and $\mathcal{H}(X) = \mathbb{C}^k$ by definition, and hence for $k > 1$ the
sequence $\mathcal{O}_{\mathrm{an}}(X) \to \mathcal{H}(X) \to 0$ is not exact.

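We are thus in the situation which we have already discussed, and our problem
is to construct functors $F^n$ from the category of sheaves on $X$ to the category of
groups in such a way that $F^0(\mathcal{F}) = \mathcal{F}(X)$, and for a short exact sequence
$0 \to \mathcal{G} \to \mathcal{F} \to \mathcal{H} \to 0$ we have a long exact sequence

$$\cdots \to F^{n-1}(\mathcal{H}) \to F^n(\mathcal{G}) \to F^n(\mathcal{F}) \to F^n(\mathcal{H}) \to F^{n+1}(\mathcal{G}) \to \cdots \tag{16}$$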

We will argue in the same way as in constructing the functors $\operatorname{Ext}^n$. Suppose
that functors $F^n$ with the required properties have been constructed, that we
know a class of sheaves (which we will denote by $\mathcal{Q}$) for which $F^n(\mathcal{Q}) = 0$ for $n \ge 1$,
and that we have represented a sheaf $\mathcal{F}$ as a subsheaf of such a $\mathcal{Q}$. Then we
have an exact sequence $0 \to \mathcal{F} \to \mathcal{Q} \to \mathcal{H} \to 0$ and a corresponding long exact
sequence (16). From the fact that $F^n(\mathcal{Q}) = 0$, we get that $F^n(\mathcal{F}) = F^{n-1}(\mathcal{H})$, and
this gives an inductive definition of the functors $F^n$.

Now we proceed to construct sheaves $\mathcal{Q}$ with the required property. Since in
particular such a sheaf $\mathcal{Q}$ will satisfy $F^1(\mathcal{Q}) = 0$, if $\mathcal{Q}$ fits into an exact sequence
$0 \to \mathcal{Q} \to \mathcal{F} \to \mathcal{G} \to 0$ then $0 \to \mathcal{Q}(X) \to \mathcal{F}(X) \to \mathcal{G}(X) \to 0$ must be exact (this
follows by considering the first 4 terms of the sequence (16)). One class of sheaves
with this property is known, the so-called flabby sheaves: a sheaf $\mathcal{Q}$ is flabby (or
flasque) if the restriction homomorphism $\rho^X_U\colon \mathcal{Q}(X) \to \mathcal{Q}(U)$ maps onto $\mathcal{Q}(U)$ for
all $U \subset X$. An elementary argument shows that if $\mathcal{Q}$ is a flabby sheaf and
$0 \to \mathcal{Q} \to \mathcal{F} \to \mathcal{G} \to 0$ is an exact sequence of sheaves, then
$0 \to \mathcal{Q}(X) \to \mathcal{F}(X) \to \mathcal{G}(X) \to 0$ is exact.

A typical example of a flabby sheaf is the sheaf $\mathcal{Q}$ for which $\mathcal{Q}(U)$ is the set of
all (real or complex valued) functions on $U$; the sheaves $\mathcal{O}_{\mathcal{C}}$, $\mathcal{O}_{\mathrm{diff}}$ and $\mathcal{O}_{\mathrm{an}}$ are all
subsheaves of this. In a similar way, any sheaf is a subsheaf of a flabby sheaf. We
are now in a situation which is completely analogous to that which we considered
in §21.B, and we can give the definition of the new functors $F^n$ first of all by
induction: if a sheaf $\mathcal{F}$ fits in an exact sequence $0 \to \mathcal{F} \to \mathcal{Q} \to \mathcal{G} \to 0$ with $\mathcal{Q}$
flabby, then $F^n(\mathcal{F}) = F^{n-1}(\mathcal{G})$; $F^0(\mathcal{F}) = \mathcal{F}(X)$. It can be proved that $F^n(\mathcal{F})$ does
not depend on the choice of the exact sequence $0 \to \mathcal{F} \to \mathcal{Q} \to \mathcal{G} \to 0$. These groups are
called the cohomology groups of $\mathcal{F}$, and denoted by $H^n(X, \mathcal{F})$.

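Putting together the $n$ steps involved in the definition of $H^n(X, \mathcal{F})$, we can
obtain a noninductive definition. An exact sequence

$$0 \to \mathcal{F} \to \mathcal{Q}_0 \to \mathcal{Q}_1 \to \cdots \to \mathcal{Q}_n \to \cdots$$

of sheaves in which the $\mathcal{Q}_i$ are flabby is called a flabby resolution of $\mathcal{F}$. Applying
the functor $F(\mathcal{F}) = \mathcal{F}(X)$ to it and leaving out the first term, we obtain a cochain
complex

$$\mathcal{Q}_0(X) \to \mathcal{Q}_1(X) \to \cdots \to \mathcal{Q}_n(X) \to \cdots;$$

the cohomology of this provides the groups $H^n(X, \mathcal{F})$.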

Applications of sheaf cohomology are related to certain finiteness theorems 
concerning them. The first of these shows that in many situations a sheaf has 
only a finite number of nonzero cohomology groups. 


Theorem. If $X$ is an $n$-dimensional manifold and $\mathcal F$ is any sheaf on
$X$ then $H^q(X, \mathcal F) = 0$ for any $q > n$.


The second finiteness theorem relates to the case when all the groups
$\mathcal F(U)$ are vector spaces (over $\mathbb R$ or $\mathbb C$), and the
restriction homomorphisms $\rho_U^V$ are linear maps. Then the
$H^q(X, \mathcal F)$ are also vector spaces, and one can ask about their
dimensions. Of special interest is the dimension of
$H^0(X, \mathcal F) = \mathcal F(X)$, usually the most important invariant.
Generally speaking, this dimension is infinite even in the simplest cases; for
example, for the sheaves $\mathcal O_{\mathcal C}$ or
$\mathcal O_{\mathrm{diff}}$, when $H^0(X, \mathcal O_{\mathcal C})$ is the space
of all continuous, and $H^0(X, \mathcal O_{\mathrm{diff}})$ of all
differentiable functions on $X$. However, there are important cases when the
corresponding spaces are finite-dimensional. Suppose for example that $X$ is the
Riemann sphere. Then $H^0(X, \mathcal O_{\mathrm{an}})$ is the space of
functions that are holomorphic at every point (including the point at
infinity). By Liouville’s theorem such a function is constant, so that
$H^0(X, \mathcal O_{\mathrm{an}}) = \mathbb C$ is 1-dimensional. The same thing
holds for any compact connected complex analytic manifold $X$: on it
$H^0(X, \mathcal O_{\mathrm{an}}) = \mathbb C$. It can be proved that in this
case all the cohomology groups $H^q(X, \mathcal O_{\mathrm{an}})$ are also
finite-dimensional over $\mathbb C$. The same holds for the sheaf of holomorphic
differential forms or holomorphic vector fields on a compact complex analytic
manifold, and for the sheaf $\mathcal F_D$ of Example 7 if the Riemann surface
$X$ is compact. We restrict ourselves here to the above examples, and do not
state the general theorem on this subject.

In all cases when $X$ is a finite-dimensional manifold and the spaces
$H^q(X, \mathcal F)$ are finite-dimensional, we can define the Euler
characteristic of a sheaf $\mathcal F$:

$$\chi(X, \mathcal F) = \sum_q (-1)^q \dim H^q(X, \mathcal F) \qquad (17)$$

(the sum consists of a finite number of terms, in view of the first finiteness
theorem stated above).

This new invariant plays a two-fold role. First of all, the Euler
characteristics of certain standard sheaves, which are related to the manifold
in an intrinsic way, give invariants of the manifold itself. For example for a
compact Riemann surface $X$,

$$\chi(X, \mathcal O_{\mathrm{an}}) = 1 - p,$$

where $p$ is the genus of $X$; note that $1 - p$ is one half of the topological
Euler characteristic $\chi(X, \mathbb R)$ of $X$. In a similar way, for any
compact complex analytic manifold $X$, the Euler characteristic
$\chi(X, \mathcal O_{\mathrm{an}})$ gives an important invariant, the arithmetic
genus of $X$.

On the other hand, the Euler characteristic turns out to be a ‘coarse’ invariant,
and easy to calculate. In the majority of cases we are mainly interested in the
dimension of the first term $\dim H^0(X, \mathcal F)$ in the sum (17), but this
is already a more delicate problem, which can be solved, for example, if we
manage to prove that all the remaining terms are zero. Thus, in the case of the
sheaf $\mathcal F_D$ of Example 7 associated with a divisor $D = \sum n_i x_i$
on a compact Riemann surface $X$, the Euler characteristic
$\chi(X, \mathcal F_D)$ does not depend on the individual choice of the points
$x_i$, but only on their number $d = \sum n_i$:

$$\chi(X, \mathcal F_D) = \chi(X, \mathcal O_{\mathrm{an}}) + d = 1 - p + d. \qquad (18)$$

On the other hand, it can be proved that $H^q(X, \mathcal F_D) = 0$ for
$q \ge 2$, and that if $d > 2p - 2$ then also $H^1(X, \mathcal F_D) = 0$, so that

$$\dim \mathcal F_D(X) = 1 - p + d \quad \text{for } d > 2p - 2. \qquad (19)$$


Recall that $\mathcal F_D(X)$ is the space of functions which are meromorphic
on $X$ and have poles at $x_i$ of order at worst $n_i$. The equality (19) is
first and foremost an existence theorem for such functions. If we have such
functions at our disposal, then we can construct maps of Riemann surfaces to
one another, and study the question of their isomorphism, and so on. As the
simplest example, suppose that $p = 0$ and $D = x$ is one point. From (19) we
get that $\mathcal F_D(X)$ is 2-dimensional; since the constant functions form
a 1-dimensional subspace, we see that there exists a meromorphic function $f$
on $X$ having a pole of order 1 at $x$. It is not hard to prove that the map
defined by this function is an isomorphism of the Riemann surface $X$ with the
Riemann sphere; that is, a Riemann surface of genus 0 is analytically
isomorphic to the Riemann sphere (or in other terms, conformally equivalent to
it).
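
The count in (19) is simple enough to sketch in code. The following is only an illustrative helper, not something from the text: the function name `dim_sections` and its parameters are assumptions, and it deliberately refuses to answer outside the range d > 2p - 2, where (19) is not asserted.

```python
def dim_sections(genus: int, degree: int) -> int:
    """Dimension of the space F_D(X) according to formula (19).

    genus  -- the genus p of the compact Riemann surface X
    degree -- the number d = sum of the n_i of the divisor D
    Formula (19) is only stated for d > 2p - 2, so other inputs are rejected.
    """
    if genus < 0:
        raise ValueError("the genus must be non-negative")
    if degree <= 2 * genus - 2:
        raise ValueError("formula (19) requires d > 2p - 2")
    return 1 - genus + degree


# Genus 0, one point with multiplicity 1: the space is 2-dimensional,
# spanned by the constants and one function with a simple pole.
print(dim_sections(0, 1))
```

For p = 0 and d = 1 this reproduces the 2-dimensional space used above to identify a genus 0 surface with the Riemann sphere.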


Example 8. The analogue of the sheaf $\mathcal F_D$ of Example 7 can be
constructed for an arbitrary $n$-dimensional complex analytic manifold $X$. For
this we replace the points $x_i$ by $(n - 1)$-dimensional complex analytic
submanifolds $C_i$, set $D = \sum n_i C_i$, and take $\mathcal F_D(U)$ to be
all functions which are meromorphic on $U$ and have poles only along the
submanifolds $U \cap C_i \subset U$, of order $\le n_i$. Even in this much more
general set-up, the same principle holds: we can consider the submanifolds
$C_i$ as $(2n - 2)$-dimensional cycles, so that $D$ defines a homology class
$\sum n_i C_i$ in $H_{2n-2}(X, \mathbb Z)$, and the Euler characteristic
$\chi(X, \mathcal F_D)$ depends only on this homology class (for $n = 1$, the
homology class of the 0-dimensional cycle $\sum n_i x_i$ is the number
$d = \sum n_i$). The ‘coarse’ nature of the Euler characteristic is expressed
by the fact that it is a topological invariant, depending only on an element of
the discrete group $H_{2n-2}(X, \mathbb Z)$. There is a formula analogous to
(18) in this case too, but of course it is much more complicated. The relation
(18) is called the Riemann-Roch theorem, and the same name is used for its
generalisation which we have just mentioned.


A. Topological K-theory


We now make further use of the notion of a family of vector spaces
$f\colon E \to X$ over a topological space $X$ introduced at the end of §5. A
homomorphism $\varphi\colon E \to E'$ of a family $f\colon E \to X$ into a
family $f'\colon E' \to X$ is a continuous map $\varphi$ which takes the fibre
$f^{-1}(x)$ into $(f')^{-1}(x)$ and is linear on the fibre for each point
$x \in X$. If $\varphi$ defines an isomorphism of the fibres, then $\varphi$ is
called an isomorphism.

For each open subset $U \subset X$ and family $f\colon E \to X$ of vector
spaces, the restriction $f^{-1}(U)$ defines a family of vector spaces over $U$.

The simplest example is the family $X \times \mathbb C^n$, where $f$ is the
projection to $X$; this is called the trivial family. The main class of
families which we will consider is the class of (complex) vector bundles. By
this we mean a family which is locally trivial, that is, such that every point
$x \in X$ has a neighbourhood $U$ for which the family $f^{-1}(U)$ is
isomorphic to the trivial family $U \times \mathbb C^n$. For an arbitrary
continuous map $\varphi\colon Y \to X$ and a vector bundle $E$ on $X$, the
inverse image $\varphi^*(E)$ of $E$ is defined; the fibre of $\varphi^*(E)$
over $y \in Y$ is identified with the fibre of $E$ over $\varphi(y)$. The
precise definition is as follows: $\varphi^*(E)$ consists of points
$(y, e) \in Y \times E$ for which $\varphi(y) = f(e)$. The set of isomorphism
classes of vector bundles over a given space $X$ is denoted by
$\mathcal{V}ec(X)$. By means of $\varphi^*$, this is a contravariant functor
$\mathcal{V}ec$ from the category of topological spaces to the category of sets.

In $\mathcal{V}ec(X)$ we can define operations $E \oplus F$ and $E \otimes F$,
which reduce to taking direct sum and tensor products of fibres over a point of
$X$. The operation $\oplus$ is commutative and associative, but it does not
define a group, since there is obviously no negative of an element. In other
words, $\mathcal{V}ec(X)$ is a commutative semigroup (written additively) with
a 0, the fibre bundle $X \times \{0\}$; see § 20, Example 9. We can try to make
it into a group in the same way that we construct all the integers from the
nonnegative integers, or the rationals from the integers. Here we should note
one ‘pathological’ property of addition in $\mathcal{V}ec(X)$: from
$a \oplus c = b \oplus c$ it does not follow that $a = b$. In view of this, the
required group consists of all pairs $(a, b)$ with
$a, b \in \mathcal{V}ec(X)$, with pairs $(a, b)$ and $(a', b')$ identified if
there exists $c$ such that $a \oplus b' \oplus c = a' \oplus b \oplus c$.
Addition of pairs is defined component-by-component. It is easy to see that the
set of classes of pairs defines a group, in which the class $(a, b)$ equals the
difference $(a, 0) - (b, 0)$. The group we obtain is denoted by $K(X)$. Taking
a fibre bundle $a \in \mathcal{V}ec(X)$ into the class of the pair $(a, 0)$ we
get a homomorphism

$$\mathcal{V}ec(X) \to K(X),$$

and $a, b \in \mathcal{V}ec(X)$ map to the same element of $K(X)$ only if there
exists $c \in \mathcal{V}ec(X)$ such that $a \oplus c = b \oplus c$. It is easy
to see that $K(X)$ defines a contravariant functor from the category
$\mathcal{T}op$ into the category of Abelian groups. For example, if $X$ is a
point, then $\mathcal{V}ec(X)$ just consists of finite-dimensional vector
spaces; hence an element of $\mathcal{V}ec(X)$ is determined by its dimension
$n \ge 0$, and $K(X) = \mathbb Z$. In the general case, the study of the group
$K(X)$ is ‘linear algebra over a topological space $X$’.
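
The passage from a commutative semigroup to a group by pairs can be sketched for any semigroup whose elements a program can compare. The code below is a minimal illustration under stated assumptions: the names `same_class` and `add_pairs` are hypothetical, the witnesses c are searched only in a finite list supplied by the caller, and the two semigroups used (nonnegative integers under addition, standing in for vector bundles over a point, and nonnegative integers under `max`, standing in for a semigroup without cancellation) are chosen purely for illustration.

```python
def same_class(pair1, pair2, add, witnesses):
    """Decide whether (a, b) and (a2, b2) are identified, that is, whether
    a + b2 + c == a2 + b + c for some witness c in the given finite list."""
    (a, b), (a2, b2) = pair1, pair2
    return any(add(add(a, b2), c) == add(add(a2, b), c) for c in witnesses)


def add_pairs(pair1, pair2, add):
    """Component-by-component addition of pairs."""
    return (add(pair1[0], pair2[0]), add(pair1[1], pair2[1]))


def plus(m, n):
    return m + n


# Bundles over a point are determined by their dimension n >= 0, so the
# semigroup is the nonnegative integers and the class of (a, b) is a - b.
dims = list(range(20))
print(same_class((3, 1), (5, 3), plus, dims))   # both represent the integer 2
print(same_class((3, 1), (4, 1), plus, dims))   # 2 versus 3

# A semigroup without cancellation: max(1, c) == max(0, c) once c >= 1,
# so the distinct elements 1 and 0 have the same image in the group.
print(same_class((1, 0), (0, 0), max, dims))
```

The last line shows why the extra element c is needed in the identification: without it, the relation on pairs would fail to be transitive in a semigroup where cancellation does not hold.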

From now on we restrict attention to the category $\mathcal C$ of compact
topological spaces with a marked point. For these, it can be proved that if
$\varphi\colon Y \to X$ is a homotopy equivalence then $\varphi^*$ defines an
isomorphism of the groups $K(X)$ and $K(Y)$. Thus the functor $K(X)$ extends to
the category $\mathcal{HC}_0$ of compact spaces up to homotopy equivalence
(with a marked point). If $x_0 \in X$ is a marked point and $f\colon E \to X$ a
fibre bundle, then taking $E$ to the dimension of the fibre $f^{-1}(x_0)$
defines a homomorphism

$$\psi\colon K(X) \to \mathbb Z$$

(we can say that $\psi = \varphi^*$ where $\varphi\colon x_0 \hookrightarrow X$
is the inclusion). The kernel of $\psi$ is denoted by $\tilde K(X)$. This
construction is analogous to the introduction of the group $\tilde H^0(X, A)$
in connection with the exact sequence (2) of § 21. It is easy to prove that

$$K(X) = \mathbb Z \oplus \tilde K(X).$$


If $Y \subset X$ is a closed subset with $x_0 \in Y$ then the inclusion map
$f\colon (Y, x_0) \hookrightarrow (X, x_0)$ and the contraction of $Y$ to a
point $g\colon X \to X/Y$ (where the image of $Y$ is the marked point of $X/Y$)
define the homomorphisms in a sequence

$$\tilde K(X/Y) \to \tilde K(X) \to \tilde K(Y),$$

and it is not hard to prove that this is exact. We arrive at the question
already discussed in § 21 of extending this sequence to an infinite exact
sequence.

In the present case, the question can be solved as follows: write $SX$ for the
reduced suspension of the space $X$ (for the definition, see §20, Example 7).
By induction, we define $\tilde K^0(X) = \tilde K(X)$ and
$\tilde K^{-n}(X) = \tilde K^{-n+1}(SX)$. Then the sequence

$$\cdots \to \tilde K^{-n}(X/Y) \to \tilde K^{-n}(X) \to \tilde K^{-n}(Y) \to \tilde K^{-n+1}(X/Y) \to \cdots$$

is exact (under suitable definitions of the homomorphisms
$\tilde K^{-n}(Y) \to \tilde K^{-n+1}(X/Y)$, which we omit). This definition
can be explained as follows. Replace the reduced suspension of $X$ by the
suspension $\Sigma X$ (once more, see § 20, Example 17 for the definition),
which has the same homotopy type. Write $CX$ for the cone over $X$, that is
$(X \times I)/(X \times 1)$. Viewing $X$ as the base of the cone
$X \times 0 \subset CX$, we can say that $\Sigma X = CX/X$. The cone $CX$ has
the homotopy type of a point, since it can be contracted to its vertex; hence
the inclusion $X \hookrightarrow CX$ is an analogue of representing a module
$M$ as the homomorphic image of a projective module by $f\colon P \to M$; and
$CX/X$ is an analogue of $\operatorname{Ker} f$. Thus our definition is similar
to the inductive definition of the functors $\operatorname{Ext}^n$ given in
§21.B; or more precisely, dual to it (the inclusion of $X$ and the surjection
onto $M$ have changed places), which is indicated by the negative indexes in
$\tilde K^{-n}(X)$.

A remarkable property of this ‘cohomology theory’ is that it is periodic:

$$\tilde K^{-n}(X) = \tilde K^{-n+2}(X). \qquad (1)$$

We cannot stop here to discuss the proof of this periodicity theorem.
Periodicity allows us to extend our sequence of functors $\tilde K^n(X)$ in a
natural way to positive values of $n$, preserving condition (1). The
‘cohomology theory’ arising in this way is called K-theory. Of course, there
are essentially only two functors $\tilde K^0(X)$ and $\tilde K^1(X)$ in it. By
definition $\tilde K^0(X) = \tilde K(X)$ and $\tilde K^1(X) = \tilde K(SX)$.

We give another interpretation of the functor $\tilde K^1(X)$. For this, we
once more replace the reduced suspension $SX$ by the usual suspension
$\Sigma X$, and recall that it can be obtained by glueing together the two cones

$$C_1X = (X \times [0, 1/2])/(X \times 0) \quad \text{and} \quad C_2X = (X \times [1/2, 1])/(X \times 1)$$

along their bases $X \times 1/2$. On each cone, a fibre bundle $E$ is
isomorphic to the trivial bundle (since the cone is contractible). Hence the
fibre bundle $E$ over $\Sigma X$ is obtained by glueing together the bundles
$\mathbb C^n \times C_1X$ and $\mathbb C^n \times C_2X$ along
$\mathbb C^n \times X$. This glueing is realised by an isomorphism
$\varphi_x$ of the fibres $\mathbb C^n$ over the corresponding points of the
base of the cone, that is, by a family of maps
$\varphi_x \in GL(n, \mathbb C)$ for $x \in X$, or a continuous map
$X \to GL(n, \mathbb C)$ given by $x \mapsto \varphi_x$. From these
considerations we get the interpretation we need:

$$\tilde K^1(X) = \tilde K(SX) = H(X, GL). \qquad (2)$$

Here $H(\ ,\ )$ denotes the set of morphisms in the category $\mathcal{H}ot$,
and the somewhat indeterminate symbol $GL$ denotes that we must take maps into
$GL(n, \mathbb C)$ for arbitrarily large $n$. We can embed the groups
$GL(n, \mathbb C)$ into one another,

$$GL(n, \mathbb C) \hookrightarrow GL(n + 1, \mathbb C) \quad \text{by} \quad A \mapsto \begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix},$$

and take their union; this will be our space $GL$.

K-theory, like any other cohomology theory, gives a certain projection of the
homotopic topology into algebra. In the given case, the projection reproduces
the original in a very faithful way, since the K-functors have a whole series
of operators arising from the operations of exterior and symmetric powers of
vector spaces, and these operations are functorial, that is, compatible with
the maps $K^n(X) \to K^n(Y)$ corresponding to a continuous map
$f\colon Y \to X$. It often happens that, putting all this information
together, one gets a contradiction; these are theorems on the non-existence of
some kind of maps. For example, K-theory provides the simplest proof of the
theorem stated at the end of § 19 that a division algebra of finite rank over
the real number field has rank either 1, 2, 4 or 8. Another famous application
of K-theory is a solution of the old question of how many linearly independent
vector fields there exist on the $n$-dimensional sphere (that is, how many
vector fields $\theta_1, \dots, \theta_k$ such that at any point $x$ the
corresponding vectors $\theta_1(x), \dots, \theta_k(x)$ are linearly
independent). If the exact power of 2 dividing $n + 1$ is $2^r$ then there are
$2r$ of them if $4 \mid r$, $2r - 1$ if $r$ is of the form $4k + 1$ or
$4k + 2$, and $2r + 1$ if $r$ is of the form $4k - 1$.
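
The rule just stated is easy to mechanise. The sketch below simply transcribes it; the function name `max_vector_fields` is an assumption made for this illustration, and the code relies on nothing beyond the case distinction on r given above.

```python
def max_vector_fields(n: int) -> int:
    """Number of linearly independent vector fields on the n-dimensional
    sphere, following the rule in the text: write 2**r for the exact power
    of 2 dividing n + 1 and distinguish cases by r modulo 4."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    m, r = n + 1, 0
    while m % 2 == 0:          # extract the exact power of 2 dividing n + 1
        m //= 2
        r += 1
    if r % 4 == 0:             # 4 divides r
        return 2 * r
    if r % 4 in (1, 2):        # r = 4k + 1 or r = 4k + 2
        return 2 * r - 1
    return 2 * r + 1           # r = 4k - 1


for n in (1, 2, 3, 7, 15):
    print(n, max_vector_fields(n))
```

For even n the exact power of 2 dividing n + 1 is 1, so r = 0 and the rule gives no nowhere-vanishing field at all, while the spheres of dimension 1, 3 and 7 come out with 1, 3 and 7 fields respectively.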

But the most beautiful application of K-theory relates to the question of the
index of an elliptic operator. Linear differential operators of any finite
order on a differentiable manifold $X$ were defined in § 7, Example 3. In a
local coordinate system they are of the form

$$\mathcal D = \sum_{i_1 + \cdots + i_n \le k} a_{i_1 \dots i_n}(x) \frac{\partial^{\,i_1 + \cdots + i_n}}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}, \qquad (3)$$

where $a_{i_1 \dots i_n}(x)$ are differentiable complex-valued functions. An
$m_1 \times m_2$ matrix $(\mathcal D_{ij})$ of differential operators defines a
differential operator on the trivial vector bundles

$$\mathcal D\colon X \times \mathbb C^{m_1} \to X \times \mathbb C^{m_2}. \qquad (4)$$

It is not hard (using local triviality) to extend this definition to operators
$\mathcal D\colon E \to F$ acting on any differentiable vector bundles, but for
brevity we will restrict ourselves to (4).

Let us determine what the operator (4) gives us at one point of a manifold $X$.
For this we need to fix a point $x$ in (3), so that the coefficients
$a_{i_1 \dots i_n}(x)$ become constants. The operators
$\partial/\partial x_i = \xi_i$ are elements of the tangent space $T_x$ of $X$
(see § 5, Example 13), and $\mathcal D$ gives us a polynomial
$P(\xi, x) = \sum a_{i_1 \dots i_n}(x)\, \xi_1^{i_1} \cdots \xi_n^{i_n}$ on the
cotangent space $T_x^*$. The operator (4) defines an $m_1 \times m_2$ matrix of
such polynomials $(P_{ij}(\xi, x))$. Let $k$ be the maximum of the degrees of
all of the polynomials $P_{ij}(\xi, x)$, and $\bar P_{ij}(\xi, x)$ their
homogeneous parts of degree $k$. If $m_1 = m_2 = m$ and
$\det(\bar P_{ij}(\xi, x)) \ne 0$ for $\xi \ne 0$ then the operator (4) is
elliptic at $x$, and if this holds for all $x \in X$ then it is an elliptic
operator on $X$.
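
A few standard operators with constant coefficients may help to fix the definition. They are not taken from the text and serve only as an illustration; here $\bar P(\xi)$ denotes the homogeneous part of top degree $k$ of the polynomial attached to a scalar operator ($m = 1$), and $\xi$ runs over real covectors.

```latex
% Laplace operator on R^n: order k = 2
\Delta = \sum_{i=1}^{n} \frac{\partial^2}{\partial x_i^2},
\qquad \bar P(\xi) = \xi_1^2 + \cdots + \xi_n^2 \neq 0 \ \text{ for } \xi \neq 0
\quad \text{(elliptic)}.

% Cauchy-Riemann operator on R^2: order k = 1
\frac{\partial}{\partial x_1} + i\,\frac{\partial}{\partial x_2},
\qquad \bar P(\xi) = \xi_1 + i\,\xi_2 \neq 0 \ \text{ for } \xi \neq 0
\quad \text{(elliptic)}.

% Wave operator on R^2: order k = 2
\frac{\partial^2}{\partial x_1^2} - \frac{\partial^2}{\partial x_2^2},
\qquad \bar P(\xi) = \xi_1^2 - \xi_2^2 = 0 \ \text{ at } \xi = (1, 1)
\quad \text{(not elliptic)}.
```

In each case only the top-degree part of the polynomial matters, so adding lower-order terms to any of these operators does not change whether it is elliptic.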

Thus an elliptic operator $\mathcal D$ defines at each point $x \in X$ and for
each $\xi \in T_x^*$ with $\xi \ne 0$ a linear transformation
$(\bar P_{ij}(\xi, x)) \in GL(m, \mathbb C)$, which we denote by
$\sigma_{\mathcal D}(\xi, x)$. The map
$(\xi, x) \mapsto \sigma_{\mathcal D}(\xi, x)$ is defined for
$\xi \in T_x^*$ with $\xi \ne 0$. In other words, it is defined on the manifold
$T^* \smallsetminus s$, where $T^*$ is the cotangent bundle, and $s$ is the
zero section of $T^*$ consisting of the zero point in each fibre. The vector
space $\mathbb R^n \smallsetminus \{0\}$ is homeomorphic to
$\mathbb R_+ \times S^{n-1}$, and hence has the homotopy type of a sphere
$S^{n-1}$. Thus we can say that $\sigma_{\mathcal D}(\xi, x)$ is defined on
some fibre bundle $S_X$ over $X$ (not a vector bundle) whose fibres are spheres
$S^{n-1}$, and gives a map

$$\sigma_{\mathcal D}\colon S_X \to GL(m, \mathbb C). \qquad (5)$$

This map is called the symbol of the elliptic operator $\mathcal D$, and its
homotopy class is the most important topological invariant of $\mathcal D$. By
(2), $\sigma_{\mathcal D} \in K^1(S_X)$, and this already establishes the
connection with K-theory.

We now move on to state the index problem. The operator (4) gives a map
$A(X)^{m_1} \to A(X)^{m_2}$, where $A(X)$ is the ring of differentiable
functions on $X$. Its kernel is denoted by
$\operatorname{Ker} \mathcal D \subset A(X)^{m_1}$, and its image by
$\operatorname{Im} \mathcal D \subset A(X)^{m_2}$; the quotient
$A(X)^{m_2}/\operatorname{Im} \mathcal D$ is called the cokernel and denoted by
$\operatorname{Coker} \mathcal D$. It is proved in the theory of elliptic
operators that for an elliptic operator $\mathcal D$, the spaces
$\operatorname{Ker} \mathcal D$ and $\operatorname{Coker} \mathcal D$ are
finite-dimensional. In other words, the space of solutions of
$\mathcal D f = 0$ for $f \in A(X)^{m_1}$ is finite-dimensional, and the number
of conditions on $g$ for the equation $\mathcal D f = g$ to be solvable with
$f \in A(X)^{m_1}$ is finite. The difference between these dimensions,

$$\operatorname{Ind} \mathcal D = \dim \operatorname{Ker} \mathcal D - \dim \operatorname{Coker} \mathcal D \qquad (6)$$

is called the index of the elliptic operator $\mathcal D$.
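
The simplest case in which all the ingredients of (6) can be written down by hand is differentiation on the circle. This example is not taken from the text; it assumes $X = S^1$ with angular coordinate $\theta$ and the operator $\mathcal D = d/d\theta$ acting on differentiable functions, with $m_1 = m_2 = 1$.

```latex
% Kernel: Df = 0 means f' = 0, so f is constant
\operatorname{Ker} \mathcal D = \{\text{constant functions}\},
\qquad \dim \operatorname{Ker} \mathcal D = 1.

% Image: f' = g has a periodic solution f exactly when the integral of g vanishes
\operatorname{Im} \mathcal D
  = \Bigl\{\, g \;:\; \int_0^{2\pi} g(\theta)\, d\theta = 0 \Bigr\},
\qquad \dim \operatorname{Coker} \mathcal D = 1.

% Index
\operatorname{Ind} \mathcal D = 1 - 1 = 0.
```

Here the equation $\mathcal D f = g$ is solvable subject to exactly one linear condition on $g$, which is the sense in which the cokernel counts the ‘number of conditions’ mentioned above.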

The index theorem asserts that the index of an elliptic operator $\mathcal D$
depends only on its symbol $\sigma_{\mathcal D}$, and it gives an explicit
formula expressing $\operatorname{Ind} \mathcal D$ in terms of
$\sigma_{\mathcal D}$. A little more precisely, we can say that the space $GL$
has certain special cohomology classes, that do not depend on anything else.
The map $\sigma_{\mathcal D}$ allows us to transfer these to $S_X$. On the
other hand, $S_X$ also has certain special cohomology classes which are
entirely independent of the operator $\mathcal D$. Finally, on the cohomology
ring $H^*(S_X)$ there is a standard polynomial in all the cohomology classes so
far mentioned, which gives a class
$\alpha_{\mathcal D} \in H^{2n-1}(S_X, \mathbb Z)$ of maximal dimension
$2n - 1 = \dim S_X$. It is well known from topology that
$H^{2n-1}(S_X, \mathbb Z) = \mathbb Z$, and hence the class
$\alpha_{\mathcal D}$ is given by an integer, which turns out to be equal to
$\operatorname{Ind} \mathcal D$. Although we have only spoken of operators on
trivial bundles, the index theorem holds for elliptic operators on arbitrary
fibre bundles.

Already the qualitative fact that the index depends only on the symbol of the
operator $\mathcal D$ (and not on its more delicate analytic properties) shows
that the difference (6) is ‘coarse’, in the same way that the Euler
characteristic § 21, (17) is coarse. This is not a chance analogy. The index
theorem, applied to complex analytic manifolds and certain very simple
operators over them, implies the Riemann-Roch theorem mentioned in § 21.C.


B. Algebraic K-theory 


We mentioned in §5 the analogy between families of vector spaces
$f\colon E \to X$ and modules over a ring $A$. In particular, as we have seen,
a family $E \to X$ defines a module over the ring $\mathcal C(X)$ of continuous
functions on $X$. In this analogy, which are the modules corresponding to
vector bundles? There are many arguments indicating that these are the
projective modules of finite rank (compare § 21.B). First of all, this is shown
by the following result.


Theorem I. Let $X$ be a compact topological space and $E \to X$ a family of
vector spaces. The module $M$ over $\mathcal C(X)$ corresponding to this family
is projective if and only if $E$ is a vector bundle; $E \leftrightarrow M$
defines a 1-to-1 correspondence between vector bundles over $X$ and finitely
generated projective modules over $\mathcal C(X)$.


We can also indicate certain algebraic properties of finitely generated projective 
modules which are analogues of local triviality. 

This justifies the introduction (by analogy with § 22.A) of the semigroup
$\Pi(A)$ whose elements are classes of finitely generated projective modules
over a ring $A$ and whose sum is the direct sum of modules. Repeating
word-for-word the arguments of §22.A, we can construct a group $K(A)$ and a map
$\varphi\colon \Pi(A) \to K(A)$ such that the set $\varphi(\Pi(A))$ generates
$K(A)$, and such that for two projective modules $P, Q \in \Pi(A)$, their
images $\varphi(P)$ and $\varphi(Q)$ are equal if and only if there exists a
third module $R \in \Pi(A)$ for which $P \oplus R \cong Q \oplus R$.

For any prime ideal $I \subset A$, the ring $A/I$ can be embedded in a field
$k$, and hence there exists a homomorphism $A \to k$ with kernel $I$. Thus $k$
is an $A$-module and the module $M \otimes_A k$ is defined. If $M$ is finitely
generated, then $M \otimes_A k$ is a finite-dimensional vector space over $k$.
It can be proved that for an integral domain $A$ and a projective module $M$,
the dimension of this space is independent of the choice of ideal $I$, and for
$I = 0$ is equal to $\operatorname{rank} M$ (compare § 5). The function
$\operatorname{rank} M$ extends to $K(A)$ and defines a homomorphism
$K(A) \to \mathbb Z$ whose kernel is denoted by $\tilde K(A)$. It is easy to
see that $K(A) = \tilde K(A) \oplus \mathbb Z$.

We consider the groups $K(A)$ and $\tilde K(A)$ for some very simple rings.


Theorem II. If $A = k$ is a field, then $\Pi(k)$ consists of finite-dimensional
vector spaces over $k$, the homomorphism $K(k) \to \mathbb Z$ is defined by
dimension, and is obviously an isomorphism, so that $\tilde K(k) = 0$.


Theorem III. If $A$ is a principal ideal domain then for any module $M$ of
finite type we have $M \cong M_0 \oplus A^r$ where $M_0$ is a torsion module
(see § 6, Theorem II). If $M$ is projective, then it is a direct summand of a
free module, and hence is torsion-free. Hence $M_0 = 0$ and $M \cong A^r$, and
this again means that $\tilde K(A) = 0$.


Consider the ring $A$ of numbers of the form $a + b\sqrt{-5}$ for
$a, b \in \mathbb Z$ (§4, Theorem VIII). We saw in §4 that the ideal
$P = (3, 2 + \sqrt{-5})$ is not principal. It is not hard to show that
$\varphi(P) - \varphi(A) \in \tilde K(A)$, and $\varphi(P) \ne \varphi(A)$, so
that $\tilde K(A) \ne 0$.
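
One standard route to the non-principality of P can be checked by a finite search. The sketch below rests on an assumption that is not spelled out in the text: the quotient A/P has 3 elements, so a generator of P would have to be an element a + b√-5 of norm a² + 5b² = 3. The function names are hypothetical.

```python
def norm(a: int, b: int) -> int:
    """Norm of a + b*sqrt(-5), that is, a^2 + 5*b^2."""
    return a * a + 5 * b * b


def elements_of_norm(n: int):
    """All pairs (a, b) of integers with a^2 + 5*b^2 == n.
    Since a^2 <= n and 5*b^2 <= n, a bounded search is exhaustive."""
    bound = int(n ** 0.5) + 1
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if norm(a, b) == n]


print(elements_of_norm(3))   # no element of norm 3 exists
print(elements_of_norm(9))   # norm 9 is attained, e.g. by 3 and by 2 + sqrt(-5)
```

The empty result for norm 3 is what rules out a single generator under the stated assumption, while both listed generators of P have norm 9.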


Theorem IV. If $A$ is the ring of integers of any algebraic number field (see
the end of § 7), the group $\tilde K(A)$ is finite and isomorphic to the group
of ideal classes of this field (§ 12, Example 1). In particular, for the ring
$A$ of numbers of the form $a + b\sqrt{-5}$ with $a, b \in \mathbb Z$ we have
$\tilde K(A) \cong \mathbb Z/2$.


Theorem V. If $X$ is a compact topological space then the group
$K(\mathcal C(X))$ defined in this section is isomorphic to $K(X)$ defined in
§22.A.


The definition of higher K-functors $K_n(A)$ proceeds along already familiar
lines. First of all a group $K(I)$ is also defined for an ideal $I \subset A$
(the definition given above is not applicable, since we always considered rings
with 1). Then we construct an exact sequence

$$K(I) \to K(A) \to K(A/I) \qquad (7)$$

and finally, we define groups $K_n(A)$ such that $K_0 = K$, and which extend
(7) to an infinite exact sequence

$$\cdots \to K_{n+1}(A/I) \to K_n(I) \to K_n(A) \to K_n(A/I) \to K_{n-1}(I) \to \cdots$$

(in algebraic K-theory the functors $K_n$ are covariant, so that the index is
written as a subscript). We will not state all of these definitions, but merely
give interpretations of some of the groups which arise in this way.

The interpretation of the groups $K_1(A)$ is analogous to that which gives
relation (2) in the topological case. A continuous map
$\varphi\colon X \to GL(n)$ is an invertible matrix whose entries are
continuous functions of a point of $X$, that is an element of
$GL(n, \mathcal C(X))$, where $\mathcal C(X)$ is the ring of continuous
functions on $X$. Thus the natural starting point should be the group
$GL(n, A)$, and as in § 22.A, the infinite limit $GL(A)$ of these as
$n \to \infty$. But now we must also interpret the letter $H$ in formula (2),
that is, recall that maps $\varphi\colon X \to GL$ are considered up to
homotopy. This means that we consider the quotient group $GL(A)/GL(A)_0$, where
$GL(A)_0$ is the connected component of the identity in $GL(A)$. What is the
analogue of this subgroup in the algebraic case? In many questions, matrixes
playing the role of transformations that are ‘trivially deformable to the
identity’ are given by

$$E + aE_{ij}, \qquad (8)$$

where $i \ne j$ and $E_{ij}$ is the matrix with 1 in the $(i, j)$th place and 0
elsewhere; matrixes of the form (8) are called elementary matrixes. For
example, the proof that the group $SL(n, \mathbb R)$ is connected is based on
the fact that any element of it is a product of elementary matrixes. The
subgroup generated by all elementary matrixes in $GL(A)$ is written $E(A)$. A
fact which is unexpected, although quite elementary to prove, is that $E(A)$ is
the commutator subgroup of $GL(A)$. The proof of this uses in an essential way
the fact that we are considering the union of all the groups $GL(n, A)$ for
$n = 1, 2, \dots$; for each individual group this is not true in general. In
particular, the group $GL(A)/E(A)$ is commutative. This gives what we need:

$$K_1(A) = GL(A)/E(A).$$
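
One ingredient in the relation between elementary matrixes and commutators can be checked directly: for three distinct indices i, j, k, the commutator of E + aE_ij and E + bE_jk is the elementary matrix E + abE_ik. This identity is a standard fact that is not stated in the text; the sketch below verifies it for integer matrixes, with all function names chosen for this illustration and indices counted from 0.

```python
def identity(n):
    """The n x n identity matrix as a list of rows."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def elementary(n, i, j, a):
    """The elementary matrix E + a*E_ij of size n (0-based indices, i != j)."""
    if i == j:
        raise ValueError("an elementary matrix needs i != j")
    m = identity(n)
    m[i][j] = a
    return m


def matmul(x, y):
    """Product of two square matrixes of the same size."""
    n = len(x)
    return [[sum(x[r][t] * y[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]


def commutator(n, i, j, k, a, b):
    """x * y * x^-1 * y^-1 for x = E + a*E_ij and y = E + b*E_jk.
    The inverse of E + a*E_ij is E - a*E_ij."""
    x, x_inv = elementary(n, i, j, a), elementary(n, i, j, -a)
    y, y_inv = elementary(n, j, k, b), elementary(n, j, k, -b)
    return matmul(matmul(x, y), matmul(x_inv, y_inv))


# With i, j, k = 0, 1, 2 the commutator equals E + (a*b)*E_02.
print(commutator(3, 0, 1, 2, 2, 5) == elementary(3, 0, 2, 10))
```

The identity needs three distinct indices, hence matrixes of size at least 3, which gives some feeling for why passing to the union of all the groups GL(n, A) matters.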


Passing to the determinant gives a homomorphism $GL(A)/E(A) \to A^*$ and even
a representation

$$K_1(A) = A^* \oplus SK_1(A) \quad \text{where} \quad SK_1(A) = SL(A)/E(A)$$

(where admittedly the additive and multiplicative notation are hopelessly
confused).

From a course of elementary linear algebra, we know that
$SL(n, k) = E(n, k)$ for a field $k$; this follows from Gauss’ method of
solving linear equations by row and column operations. Hence $SK_1(k) = 0$. If
$A$ is an integral domain with a Euclidean algorithm then the main lemma of §6,
which was the basis for the proof of the structure theorem of modules of finite
type, gives the same result: $SK_1(A) = 0$. The group $K_1$ occurs (and first
occurred) in topology in the case of the group ring $A = \mathbb Z[G]$ of a
finite Abelian group $G$ (with $G$ the fundamental group of some manifold). In
this case it is often nontrivial. Generally speaking, the homomorphism
$GL(A) \to K_1(A)$ is a kind of ‘universal determinant’. In this form it can
also be generalised to the case of a noncommutative ring $A$.

We only describe the group $K_2$ when $A = k$ is a field. It can then be given
by generators $\{a, b\}$ corresponding to any elements $a, b \in k$ with
$a, b \ne 0$. The defining relations are of the form:

$$\{a_1 a_2, b\} = \{a_1, b\}\{a_2, b\}, \qquad (9_1)$$
$$\{a, b_1 b_2\} = \{a, b_1\}\{a, b_2\}, \qquad (9_2)$$
$$\{a, 1 - a\} = 1 \quad \text{for } a \ne 0 \text{ or } 1. \qquad (9_3)$$


The group $K_2(k)$ has an especially vivid application to the description of
division algebras of finite rank over $k$ (compare § 11 and § 12, Example 3 for
the definition of the notions occurring here).

It can be shown that for an arbitrary field $k$ all the elements of the Brauer
group $\operatorname{Br}(k)$ have finite order: if $\dim_k D = n^2$ (see §11,
Theorem IV), then the element corresponding to $D$ in the Brauer group has
$n$th power equal to 1. This proves in particular that the generalised
quaternion algebras $(a, b)$ introduced in § 11 define elements of
$\operatorname{Br}(k)$ of order 2 or 1.

We only give a precise description of the relation between $K_2(k)$ and the
Brauer group $\operatorname{Br}(k)$ for elements of order 2 of these groups. In
any Abelian group $C$, the elements satisfying $c^2 = 1$ obviously form a
subgroup: we denote it by ${}_2C$. The elements of the form $c^2$ for
$c \in C$ also form a subgroup; we denote it by $C^2$. We now send any
generator $\{a, b\}$ of the group $K_2(k)$ with presentation (9), where
$a, b \in k$ with $a, b \ne 0$, into the generalised quaternion algebra
$(a, b)$. It is not hard to check that relations (9) are satisfied in the
Brauer group: checking $(9_1)$ and $(9_2)$ is a simple exercise in tensor
multiplication, and the proof of $(9_3)$ follows from the fact that the algebra
$(a, b)$ defines the identity element of $\operatorname{Br}(k)$ if and only if
$ax^2 + by^2 = 1$ is solvable in $k$ (see § 11, (4)); but
$a \cdot 1^2 + (1 - a) \cdot 1^2 = 1$! Thus we get a homomorphism
$\varphi\colon K_2(k) \to \operatorname{Br}(k)$. As we have seen,
$\varphi(K_2(k)) \subset {}_2\operatorname{Br}(k)$, and hence
$\varphi(K_2(k)^2) = 1$. In consequence we get the homomorphism

$$\varphi\colon K_2(k)/K_2(k)^2 \to {}_2\operatorname{Br}(k). \qquad (10)$$


The main result is that (10) is an isomorphism. This is a very strong
assertion: in view of the description (9) of $K_2(k)$ we get a presentation of
${}_2\operatorname{Br}(k)$ by generators and relations, and indeed for a
completely arbitrary field $k$. There is a similar description for the group
${}_n\operatorname{Br}(k)$, consisting of elements
$c \in \operatorname{Br}(k)$ for which $c^n = 1$. Since all elements of
$\operatorname{Br}(k)$ are of finite order,
$\operatorname{Br}(k) = \bigcup_n {}_n\operatorname{Br}(k)$, so that as a
result we get an explicit description of the whole of this group.

An increasing role is being played by algebraic K-theory in arithmetic
questions. We give some examples which do no more than hint at one direction
of such applications: the relation between K-theory and the values of
zeta-functions. The classical Riemann zeta-function is defined by the series

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \quad \text{for } \operatorname{Re} s > 1,$$


and has an analytic continuation over the whole plane of the complex variable
$s$. It satisfies the Euler identity:

$$\zeta(s) = \prod_p \frac{1}{1 - p^{-s}}, \qquad (11)$$

in which the product takes place over all primes $p$. For a finite field
$\mathbb F_q$ with $q$ elements we define its zeta-function to be

$$\zeta_{\mathbb F_q}(s) = \frac{1}{1 - q^{-s}}.$$

Then Euler’s identity (11) can be rewritten in the form

$$\zeta(s) = \prod_p \zeta_{\mathbb F_p}(s). \qquad (12)$$


This accords very well with the ‘functional view of a ring’ discussed in § 4,
according to which we should view $\mathbb Z$ as a ring of functions on the set
of prime numbers $p$, with values in $\mathbb F_p$. This suggests the
definition of a zeta-function analogous to (12) for a wide class of rings.
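
The identity (12) can be tested numerically at a point where the series converges. The sketch below compares a partial sum of the series with a partial product of the factors for the finite fields with p elements at s = 2, using the known value π²/6 of the zeta-function there; the function names and the truncation bounds are choices made for this illustration.

```python
import math


def primes_up_to(limit: int):
    """All primes p <= limit, by the sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [n for n, is_prime in enumerate(sieve) if is_prime]


def zeta_finite_field(q: int, s: float) -> float:
    """Zeta-function of the finite field with q elements: 1 / (1 - q^(-s))."""
    return 1.0 / (1.0 - q ** (-s))


def zeta_partial_sum(s: float, terms: int) -> float:
    """Sum of n^(-s) for n = 1, ..., terms."""
    return sum(n ** (-s) for n in range(1, terms + 1))


def euler_partial_product(s: float, limit: int) -> float:
    """Product of the zeta-functions of F_p over the primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= zeta_finite_field(p, s)
    return product


exact = math.pi ** 2 / 6
print(zeta_partial_sum(2, 100000) - exact)
print(euler_partial_product(2, 10000) - exact)
```

Both differences are small, and they shrink further as the truncation bounds grow, which is the numerical shadow of the equality of the series and the product for Re s > 1.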

We now return to K-theory. In the case of finite fields $\mathbb F_q$, it can
be proved that all the groups $K_n(\mathbb F_q)$ are finite for $n \ge 1$. The
information about their orders can be written out in the following beautiful
form:

$$\frac{|K_{2m}(\mathbb F_q)|}{|K_{2m+1}(\mathbb F_q)|} = |\zeta_{\mathbb F_q}(-m)|, \quad \text{for } m \ge 1.$$

For the case of the ring $\mathbb Z$, the facts known at present allow us to
suppose some kind of connection between the values of the Riemann zeta-function
$\zeta(-m)$ for $m > 0$ and the ratios
$|K_{2m}(\mathbb Z)|/|K_{2m+1}(\mathbb Z)|$. It is known that $\zeta(-m)$ are
rational numbers for odd integers $m > 0$, and 0 for even integers $> 0$; and
that the groups $K_{2m}(\mathbb Z)$ and $K_{2m+1}(\mathbb Z)$ are finite. The
relation

$$\frac{|K_{2m}(\mathbb Z)|}{|K_{2m+1}(\mathbb Z)|} = |\zeta(-m)|, \quad \text{for odd } m > 0$$

is already false in the simplest case $m = 1$, since $|K_2(\mathbb Z)| = 2$,
$|K_3(\mathbb Z)| = 48$, but $\zeta(-1) = -\frac{1}{12}$. However, it is not
excluded that it holds up to powers of 2. In any case, it is proved that the
denominator of $\zeta(-m)$ divides $|K_{2m+1}(\mathbb Z)|$. On the other hand,
if $p$ is a prime number and $p > m$ then the power of $p$ dividing
$|K_{2m}(\mathbb Z)|$ is not less than that dividing the numerator of
$\zeta(-m)$ (under an additional condition on $p$ which conjecturally is always
satisfied, and which has been checked for $p < 125{,}000$). Recall that we have
already met the values of zeta-functions at negative integers in connection
with the cohomology of arithmetic groups (§ 21, Example 5). This is not a
coincidence; here a relation with the groups $K_n(\mathbb Z)$ does in fact
exist.
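
The arithmetic in the case m = 1 can be reproduced with exact fractions. The sketch below uses the classical formula expressing the value of the zeta-function at -m, for m ≥ 1, as -B_{m+1}/(m+1) in terms of Bernoulli numbers; this formula is not stated in the text and is brought in here as an assumption, as are the function names. The orders 2 and 48 are the ones quoted above.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count: int):
    """The Bernoulli numbers B_0, ..., B_(count-1), with B_1 = -1/2,
    from the recurrence sum_{j=0}^{n} C(n+1, j) * B_j = 0 for n >= 1."""
    b = [Fraction(1)]
    for n in range(1, count):
        s = sum(comb(n + 1, j) * b[j] for j in range(n))
        b.append(-s / (n + 1))
    return b


def zeta_at_negative(m: int) -> Fraction:
    """Value of the Riemann zeta-function at -m for an integer m >= 1,
    computed as -B_(m+1) / (m + 1)."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return -bernoulli_numbers(m + 2)[m + 1] / (m + 1)


value = zeta_at_negative(1)
ratio = Fraction(2, 48)                 # |K_2(Z)| / |K_3(Z)| as quoted above
print(value, ratio, abs(value) / ratio)
print(48 % value.denominator == 0)      # the denominator 12 divides 48
```

The two quantities differ exactly by a factor of 2, in line with the remark that the relation may still hold up to powers of 2, and the vanishing at even negative integers can be seen by evaluating `zeta_at_negative(2)`.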

The relations of K-theory with number theory are many and various, but we
cannot describe them here in more detail, since this would require us to introduce
complicated technical tools.

The whole of this book is based on the interweaving of two themes: the systematic treatment of
algebraic notions and theories, and the working out of key examples. References for these two themes
are described separately. It is my impression that, as a rule, the first edition of a book may sometimes
be fresher and more interesting than subsequent ones, even when these are technically more finished
in some respects. For this reason references are to the first editions known to me.

The basic notions of algebra, groups, rings, modules, fields, and the main theories pertaining to 
these notions, including the theory of semisimple modules and rings and Galois theory, are treated 
in the classical two-volume textbook of van der Waerden [104 (1930, 1931)]. Although more than 
half a century has elapsed since the appearance of this remarkable book, it is in no way dated, and 
for the majority of the questions it treats no better source can be found even today. 

The process of isolating out the basic algebraic notions, and recasting algebra in the spirit of an 
axiomatised approach occupied more than a century, and involved the participation of Gauss, Galois, 
Jordan, Klein, Kronecker, Dedekind and Hilbert. But fixing the results of this century-old process 
in the form of the standard language of algebra took scarcely more than a decade, from 1920 to 1930; 
an especially prominent role in this was played by E. Noether. It was at this time that van der 
Waerden’s book appeared. To get a feeling for the change in the whole spirit of algebra and the 
manner of its exposition, it is useful to compare van der Waerden’s book with Weber’s course 
[105 (1898, 1899)], from which algebra had been studied by previous generations. 

Of the more recent literature we must note the books of the Bourbaki series Eléments de 
mathématiques devoted to algebra [16 (1942-1948)], [17 (1959)]. These books might give the
impression that they could serve as textbooks for beginners, since their treatment is almost entirely 
self-contained, and starts from the simplest definitions. This impression is however entirely illusory 
in view of their basic principle, to consider the subject in the maximal possible generality, and in view 
of the complete absence of any material motivating the introduction of notions and the direction in 
which the theory is developed. However, the specialist may find a wealth of valuable details in them. 

From the more specialised texts, we note the course of commutative algebra of Atiyah and 
Macdonald [5 (1969)], which is written bearing in mind also the interests of nonalgebraists.

The specialised results on the structure of division algebras given in § 11 are treated systematically
in the survey of Deuring [33 (1935)] or the book of A. Weil [106 (1967)].

The references we have quoted cover in the main the material of §§ 2-11 of this book. From § 12
we go over to group theory; for the foundations we can here again recommend the book of van der 
Waerden and (for a slight extension of the point of view) the chapters on group theory of H. Weyl’s
classical monograph [107 (1928)]. Although intuitive objects such as transformation groups appear 
as examples of groups, the example that was most stimulating for the development of the general 
notion of a group and for transforming group theory into an independent subject was the permutation 
group of a finite set, more specifically, the set of roots of a polynomial; ideas going back to Lagrange 
and Abel took concrete form in the works of Galois [45 (1951)]. Very clearly visible in this is the 
growth of understanding that questions of field theory relate to the Galois group specifically as an 
abstract group, despite the fact that the group is realised as a concrete permutation group. (Lagrange 
expressed this idea by saying that ‘permutations are the metaphysics of equations’.) A further stimulus 
was Gauss’ systematic use of congruence classes and classes of quadratic forms, and his definition of 
operations on these [Gauss 46 (1870)], creating the feeling that some general notion lay concealed
beneath the surface. 

The first known book on the subject of group theory was that of Jordan [70 (1870)], which contains
a wealth of examples and ideas, and has not lost its value to the present day; this book treats finite 
transformation groups only. Starting with the work of Klein and Lie, considerations of infinite 
discrete and continuous groups come to the fore. Here we first have occasion to refer to Klein’s 
wonderful book ‘Lectures on the development of mathematics in the 19th century’ [73 (1926)]; the
period in the development of group theory under discussion is described here from the point of view 
of one of its most influential participants. But the book also contains much else of interest on the 
development of other branches of algebra (and of mathematics as a whole). 

The first book on abstract group theory was Burnside’s book [21 (1897)], which considered only 
finite groups. For a long time subsequent treatments only improved it; the most finished treatment 
was achieved by Speiser [98 (1937)]. A modern course in the theory of finite groups is the 3-volume
text by Huppert and Blackburn [67 (1967), 68 (1982)]. The point of view of infinite groups is given 
most prominence in Kurosh [77 (1955, 1956)]. An interesting historical survey of the theory of 
defining groups by generators and relations is contained in [Chandler and Magnus 24 (1982)].
Logical problems arising in this can be found for example in Manin’s course [81 (1977)]. 

The first book on the theory of Lie groups is the 3-volume book of Lie and Engel [79 (1883-1893)];
this book is interesting as a witness of the birth of a new branch of science. A more modern treatment 
of the main notions can be found in Pontryagin [90 (1938)], and an even more modern one (that is,
wherever possible without the use of coordinate systems) in Chevalley [25 (1946)]. A beautiful 
treatment is also given in [Hochschild 64 (1965)].

For algebraic groups we note the survey of Chevalley [28 (1958)] and the books [A. Borel 12 
(1969)], [Humphreys 66 (1975)] and [Springer 99 (1981)]. The classification of simple Lie groups 
can be found: for compact groups in [Zhelobenko 111 (1970)] and [Pontryagin 90 (1938)], for 
complex groups in [Séminaire Sophus Lie 94 (1955)], for real Lie groups in [Goto and Grosshans 50 
(1978)]. The classification of simple algebraic groups is contained in Chevalley’s seminar [27 (1958)]
and the books [Humphreys 66 (1975)] and [Springer 99 (1981)]. For finite simple groups we note 
the survey [Tits 103 (1963)] and Gorenstein’s book [49 (1982)] (although this book does not contain
a proof of the classification—a unified exposition of this does not at present exist). 

The foundations of the theory of representations of finite groups were laid by Frobenius; one 
can consult his collected works [44 (1968)] for this. A more recent treatment can be found in the 
appropriate sections of H. Weyl’s book [107 (1928)] and the early part (§§ 1-8) of [Serre 95 (1967)].
For the representations of compact groups see also the books of [Weyl 107 (1928)], [Pontryagin 90 
(1939)], [Chevalley 25 (1946)] and [Zhelobenko 111 (1970)]. A wide survey on the theory of repre- 
sentations of Lie groups is given in [Kirillov 72 (1972)]. An introduction to more recent questions 
is provided by the conference proceedings [6 (1979)] edited by Atiyah.

H. Weyl’s book [109 (1939)] is a classical study in representation theory. It had a strong 
influence on the subsequent development of this subject. It contains in particular the concept of 
‘coordinatisation’ and the idea of the relation between symmetries and representations which we 
have used in this book. 

Lie theory is included in practically all the textbooks on Lie groups we have quoted. Various stages 
of its development can be seen in [Lie and Engel 79 (1883, 1888, 1893)], [Pontryagin 90 (1939)],
[Chevalley 25 (1946)], [Hochschild 64 (1965)].
For the Cayley numbers or octavions, we note the survey of Freudenthal [43 (1951)]. For
geometrical applications see the survey [Lawson 78 (1985)].

The proof of the main theorem on nonassociative division algebras over the real number field can 
be found in [Atiyah 4 (1967)].

The notions of categories and functors were formulated in books of Eilenberg and MacLane 
[38 (1942)], [39 (1945)], where their significance as a new language for the axiomatisation of 
mathematics was argued in detail. A systematic treatment of the basic notions of category theory 
can be found in Chapter II of the book of Hilton and Stammbach [61 (1971)]. Certain aspects of it 
are considered in the first sections of Grothendieck’s article [51 (1957)]. A detailed discussion of the 
notion of a group object in a category is contained in his book [53 (1961)], Chapter 0, § 8.

A systematic treatment of the foundations of homological algebra is contained in [Hilton and 
Stammbach 61 (1971)]. The classical work in this area is the book of H. Cartan and Eilenberg 
[22 (1956)], but this is written more abstractly. The theory of group cohomology is contained in
Brown [20 (1982)] in a spirit similar to that of this book. The general notions of the theory of sheaf
cohomology are treated in [Tennison 102 (1975)]. Hirzebruch’s book [62 (1956)] is a classical
textbook, devoted in the main to applications to the Riemann-Roch theorem. 

The part of K-theory treated in this book is covered for the most part by two surveys: topological 
K-theory by Atiyah [4 (1967)], and algebraic K-theory by Milnor [85 (1971)]. For the questions of 
algebraic K-theory treated at the end of § 22 we note the survey of Suslin [100 (1984)], although
this is not written for a general audience. 

Finally, since on many occasions we have used topological notions and results (especially in 
the sections on category theory and homological algebra), we give some topological references. 
Textbooks written from a point of view similar to that of § 20 of this book are Dold [36 (1980)] and
Switzer [101 (1975)]. But in places where a more important role is played by geometrical intuition,
for example in connection with the topology of surfaces, the old book of Seifert and Threlfall [93 
(1934)] remains irreplaceable. The theory of differentiable manifolds and integrating differential 
forms on them can be found in the books of Chevalley [25 (1946)] and de Rham [92 (1955)].

We now proceed to the literature referring to the detailed workings of individual examples. Perhaps 
the theme surfacing throughout the book most richly illustrated with examples is the ‘duality’ between 
the functional and the algebraic point of view, the intuition of the elements of a ring as ‘functions’ 
on the set of its (maximal or prime) ideals, the analogy between numbers and functions. This is a 
very old complex of ideas. Properly speaking, the idea of analytically continuing functions from the 
real line to the complex plane already raises the question of some ‘natural’ set on which a function 
should be considered. A big step forward in this direction was the creation of the idea of a Riemann 
surface. In the article of Dedekind and Weber [31 (1882)], the Riemann surface of an algebraic
function field in 1 variable (in our notation, the field K(C), where C is an algebraic curve) is defined in a
purely algebraic way as a set of ‘homomorphisms’ of K(C) into K (with a symbol ∞ adjoined to K).
The article of Kronecker [76 (1882)], published in the same issue of the journal, develops a programme
for constructing a theory which unifies algebraic numbers and algebraic functions in any number of 
variables. A discussion of the idea of the parallelism between numbers and functions can be found 
in Klein’s ‘Lectures’ [73 (1926)]. A treatment of the theory of algebraic functions of 1 variable along 
ideas of the article of Dedekind and Weber is given in Chevalley’s book [26 (1951)]. 

In connection with questions of point-set topology and logic, it was proved that a Boolean algebra 
is representable as the ring of continuous functions with values in F_2 on a certain type of topological
space (for this see [Birkhoff 10 (1940)]). One can learn about the same ideas applied to rings of 
continuous real- or complex-valued functions in [Gel’fand, Raikov and Shilov 47 (1960)]; for rings
of C^∞ functions see [Bröcker 19 (1975)], and for analytic functions [Hoffman 65 (1962)]. Finally, the
concept of scheme, embracing both number theory and algebraic geometry and allowing geometric 
intuition to be applied to number-theoretical questions was developed by Grothendieck. For this 
see his survey article [Grothendieck 52 (1960)], the lectures of Manin [83 (1970)] and Chapter 5 
of the book [Shafarevich 96 (1972)]. Carrying over infinitesimal methods into the area of number 
theory, in particular the construction of the p-adic numbers comes under this heading. For an 
elementary introduction (also to the theory of rings of algebraic integers) see [Borevich and 
Shafarevich 15 (1964)], for the deeper theory, the book [Weil 106 (1967)].
Another theme running through the book is ‘coordinatisation’, in the narrow sense of introducing
coordinates in the plane and in projective spaces. For this (in particular for the role of Desargues’ 
and Pappus’ axioms) see Hilbert’s book [59 (1930)], and in a more algebraic form, the books 
[E. Artin 3 (1967)] and [Baer 7 (1952)]. Continuous geometries are the subject of von Neumann’s
book [87 (1960)].

The finite fields, which we have frequently encountered, were discovered by Galois; the complete 
theory of these is already contained in his works [45 (1951)]. Their applications (especially the 
applications of algebraic geometry over finite fields) to coding theory are treated in the surveys
[Goppa 48 (1984)] and [Manin and Vlehduts 82 (1984)].

Algebraic methods in the theory of commuting differential operators began with the results treated 
in § 5.5 of the book [Ince 69 (1927)]. These results were forgotten and rediscovered several decades 
later. For a modern survey see [Mumford 86 (1977)].

For ultraproducts see [Barwise and others 8 (1970)].

Tensor products and exterior and symmetric powers of modules are defined in [Kostrikin and 
Manin 75 (1980)] and [Bourbaki 16 (1942-1948)]. Properties of completion are treated in [Atiyah
and Macdonald 5 (1969)].

Clifford algebras appear in a large number of examples. These were introduced by Clifford in the 
19th century (see his collected works [29 (1882)]) and rediscovered (in a particular case) by Dirac in
the 20th century [35 (1930)], in connection with the attempt to represent a second order linear dif- 
ferential operator as the square of a first order operator with matrix coefficients. A detailed modern 
treatment can be found in [Bourbaki 17 (1959)].

We proceed to the examples concerning the notion of group. A discussion of the notion of symmetry 
is the subject of H. Weyl’s book [110 (1952)]. For the relation between symmetries and conservation 
laws in mechanics (E. Noether’s theorem), see [Courant and Hilbert 30 (1931)] or [Arnol’d 2 (1974)]. 
The symmetries of physical laws are discussed in the interesting lectures of Feynman [41 (1965)].

As for examples of groups not realised as transformation groups, the Ext(A, B) are discussed in 
any of the courses in homological algebra quoted, the Brauer group in [Deuring 33 (1935)], and the 
ideal class group in [Atiyah and Macdonald 5 (1969)]. 

Platonic solids and their connection with finite groups of motion are considered in detail in 
Hadamard’s book [54 (1908)]. Finite subgroups of the group of fractional-linear transformations of
the complex plane are treated in another book of Hadamard [55 (1951)]. Symmetries of lattices are 
analysed in [Klemm 74 (1982)]. 

A detailed analysis of finite groups generated by reflections is contained in Bourbaki [18 (1968)].
Amazingly enough, the same diagrams given in § 13, in terms of which these groups are classified, 
also turn up in a whole series of other classification problems (the most important of these being the 
classification of simple compact or complex Lie groups). A survey of these connections is given in 
[Hazewinkel and others 58 (1977)].

Geometric crystallography is the subject of the book of Delone, Padurov and Aleksandrov [32 
(1934)]. A more modern treatment is contained in [Klemm 74 (1982)], where the groups of ornaments
and n-dimensional crystallography are also considered. A chapter of Hilbert and Cohn-Vossen’s 
book [60 (1932)] is also devoted to this. A complete list of ornaments which characterise all of the 
17 plane groups can be found in the survey of Mal’tsev [80 (1956)].

All the crystallographic groups were classified by E.S. Fédorov in 1889 and by Schoenflies in 1890 
(independently of one another). In the following year, Fédorov classified all the groups of plane 
ornaments [Fedorov 40 (1891)], displaying what is (for a crystallographer) a highly nontrivial 
understanding of the geometrical character of the problem. It is amazing that a mathematician as 
widely read as H. Weyl could write ([110 (1952)], p. 103-4) ‘... the mathematical notion of a group 
of transformations was not provided before the nineteenth century; and only on this basis is one able 
to prove that the 17 symmetries already implicitly known to the Egyptian craftsmen exhaust all 
possibilities. Strangely enough, the proof was carried out only as late as 1924 by G. Polya, now 
teaching at Stanford’. Even more strange is the fact that the above quotation from Weyl has recently
been the subject of a series of articles in several issues of the Mathematical Intelligencer [Pedersen 
and others 89 (1983-1984)]. However, the subject under discussion was the assertion that Egyptian
craftsmen knew all 17 symmetries, and none of the participants paid any attention to the untrue
assertion as to who solved the mathematical problem of their classification.

For discrete groups of motion of the Lobachevsky plane and their connections with the theory of 
Riemann surfaces see the book of Hadamard [55 (1951)]. For the fundamental group, the universal
covering and the group of a knot we refer to the old-fashioned, but geometrical, book [Seifert and 
Threlfall 93 (1934)].

For the relations between algorithmic problems of group theory and topology, see [Fomenko 42 
(1983)]. 

Braid groups (but without the name) were first considered by Hurwitz, both in the geometrical 
form (as they are now usually defined), and as fundamental groups, and were subsequently 
rediscovered, much later, separately in each of their realisations. For this see [Chandler and Magnus 
24 (1982)].

For the role of toruses in Liouville’s theorem see Arnol’d’s book [2 (1974)]. The classical compact 
groups are carefully worked out in [Chevalley 25 (1946)]. See [Kostrikin and Manin 75 (1980)] for
examples of other Lie groups and important relations between them in special dimensions. 

For algebraic groups and their relations with discrete groups see the survey of A. Borel [13 (1963)].

Helmholtz-Lie theory, given as an example in § 17 in connection with representation theory, is the 
subject of an attractive, although slightly difficult, book of H. Weyl [108 (1923)]. 

For the examples related to the representation of O(4) and the curvature tensor of a 4-dimensional 
Riemannian manifold, see [Besse 9 (1981)].

The representations of SU(2) and their relation with quantum mechanics are treated in H. Weyl’s 
book [107 (1928)].

For the examples given as applications of group theory: a treatment of Galois theory is given in 
[van der Waerden 104 (1930, 1931)]. A brief introduction to differential Galois theory is the book 
of Kaplansky [71 (1957)]. An example of the groups arising in the Galois theory of extensions of 
p-adic number fields (the so-called Démushkin groups) and having a mysterious parallel with the 
fundamental groups of surfaces is treated in the book [Cassels and Fröhlich 23 (1967)]. For the
applications to invariant theory, see [Dieudonné and Carrell 34 (1971)]. 

As far as applications of group representations to the classification of elementary particles 
are concerned, the author is only able to list the references from which he has got to know the 
subject. The main one is the lectures of Bogolyubov [11 (1967)]. An interesting introduction 
is the survey of Dyson [37 (1964)]. The Appendix III to Zhelobenko’s book [111 (1970)] is also 
useful. 

For the interpretation of the equations of motion of a rigid body in terms of Lie groups and Lie 
algebras and generalisations of these relations, see [Arnol’d 2 (1974)] and [Fomenko 42 (1983)]. A 
more complete survey of the theory of formal groups is provided by the book [Hazewinkel 57 (1978)].

A more detailed consideration of the topological constructions given as examples in connection 
with category theory can be found in [Dold 36 (1980)] and [Switzer 101 (1975)]. 

For the homology and cohomology groups of a complex see [Hilton and Stammbach 61 (1971)]. 
De Rham cohomology and a proof of de Rham’s theorem are contained in [de Rham 92 (1955)],
although de Rham’s theorem can now most simply be proved by means of sheaf theory. 

The basic example in sheaf cohomology is the Riemann-Roch theorem. This is the subject of the 
book [Hirzebruch 62 (1956)]. 

The basic example for topological K-theory is the index theorem. A beautiful introduction to this 
is provided by Hirzebruch’s survey [63 (1965)]. A complete exposition of the proof can be found in 
[Palais 88 (1965)].

In algebraic K-theory, the theorem on the relation between K_2 and the Brauer group of a field is
due to Merkur’ev and Suslin [Suslin 100 (1984)]. Results on the computation of the orders of K_n for
finite fields are due to Quillen [91 (1972)]. For conjectures and results on the order of K_n for rings
of integers see Soulé [97 (1979)].

To get an impression of the history of the development of algebra and its interaction with the 
whole of mathematics, an invaluable source is Klein’s ‘Lectures’ [73 (1926)]. Many interesting 
observations are to be found among the historical remarks in the Bourbaki books. An interesting 
study, although devoted to the history of a specialised problem, is the book of Chandler and Magnus
[24 (1982)]. 
Translator’s note: After consulting the author and the editors at Springer-Verlag, I have translated
the references without modification (except for renumbering them into Latin alphabetical order). As 
Shafarevich writes, a book in translation must keep its original spirit, or what’s the point of translating 
it at all? 

I have also decided against adding a detailed list of references in the style of modern algebra 
textbooks—the book makes very well the point that algebra is the birthright of all mathematicians 
and scientists, and that its exposition is too important to be entrusted entirely to professional 
algebraists. In addition, when I suggested that references to classics of the subject might seem 
old-fashioned to modern students, Shafarevich’s reply was that just because we know of other people’s 
bad habits, it doesn’t follow that we should encourage them, does it?

However, I have taken the liberty of adding a small number of references to texts that have recently 
appeared; a reader needing further technical references on topics in algebra may also consult the 
references given in these, and in the Bourbaki series on algebra [16 (1942-1948), 17 (1959), 18 (1968)].
direct product of groups 108, 153, 170, 207

direct sum of rings 20, 81-83, 207 

——of fields 83, 163 

—-—of modules 34, 79, 207 

—— of representations 76, 88, 89, 162 

Dirichlet character 165 

discrete (= discontinuous) group 124-125 

discrete group in Lobachevsky space 131- 
133, 243 


8, 38, 185, 


— transformation group 125, 138 
—series 177 
— subgroup of R^n 125-128


divisibility theory in a ring 22, 23 

division algebra 66, 72, 77, 78, 83-86, 88, 90- 
96, 200, 237 

——over R 91, 201, 233, 239

divisor on a Riemann surface 226, 229 

dodecahedron 110-111, 158 

dual category C* 206




dual module M* 39 

—polyhedron 110-111 

elementary particles 158, 162, 174, 185-188 

empty set 84 

elliptic functions 126 

—operator 233 

endomorphism ring End,M 62, 63, 67, 69, 
75, 79, 166 

equivalent extensions 103 

—functors 209 

—lattices 115

—representations 76 

error-correcting code 30 

Euclidean algorithm 22, 43, 237 

Euler characteristic 224 

——of a group 224

——of a sheaf χ(X, F) 228-230, 234

— equations for rigid body motion 198 

— substitutions 16 

evaluation homomorphism 24, 27, 51 

even Clifford algebra C^0(L) 71, 72, 73, 145,
149 

—permutation 109, 112 

exact sequence 218, 226-228, 231, 236 

exceptional simple Lie groups E_6, E_7, E_8, G_2,
F_4 158, 197, 201

Ext_R(L, M) 103, 219-220

exterior product x ∧ y 39

— algebra of a module or vector space ∧M
70-71 

— power of a module ∧^r M 39

extension of finite type 46 

— of the group field 92, 198 

—of groups 154-155, 223-224

—of modules 103, 219-220 


factor group: see quotient group 
factorisation 22, 23, 26-27, 61 

family of vector spaces E → X 40, 230
Fano’s axiom in projective geometry 86 
Fermions 162 

field 11-17, 26, 29, 72, 235 

—extension L/K 12, 29, 46, 49 


— of (formal) Laurent series K((t)) 16-17, 21, 
56 

— of fractions of a ring 19, 31

— of meromorphic functions M(X) 16, 47


— of rational functions K(x), K(x_1, ..., x_n)
13, 19, 47, 49, 58, 180 

— of trigonometric functions 16 

—Q,R,C 12, 29, 31, 46, 59, 94 

final object in a category 210


finite Abelian group 42, 43 

—dimension 41-45, 191 

— extension L/K 48-49, 63, 103, 177 

—field F_q 10, 12, 29, 30, 46, 48, 86, 91, 124,
159, 177, 238, 242 

— geometry 9, 86, 91

—group 64, 100-101, 106, 107, 108-124 

——of algebraic type 159

— — of orthogonal transformations 110, 
112-114, 115 

——of fractional-linear transformations
114 

— length 77-78, 153-154 

—rank 41, 63

—reflection groups 119, 122-124 

— sheeted cover 182-183 

— simple groups 159 

finitely generated 

— Abelian group 44, 153 

— algebra or ring over A (= of finite type) 45 

— extension 46-49 

—group 135 

— module (= of finite type) 42 

finitely presented group 135-136 

first order linear differential operators 
62, 189 

flabby (= flasque) sheaf 227-228 

— resolution of a sheaf 228 

flag 110, 168-169 

formal group (law) 197, 206 

—— power series ring K[[t]], K[[x_1, ..., x_n]] 21,
56 

Fourier coefficients 164 

—series 25, 143, 170, 177 

— transform as isomorphism of modules 35 

Fredholm operators 69, 234 

free action of a group 125, 138

— generators (= basis) 35 

—group F 134, 135, 137, 138, 203 

—module 34-35, 41, 221 

— product of groups 203, 207 

Frobenius’ theorem (on division algebras 
over R) 91 

function field K(C) 14, 31, 47, 49, 91 

functional view of a ring 18, 24, 31, 40, 53-54,
61, 234-235, 238, 241 

functor 207, 208, 215, 219, 226, 231 

— Vec(X) 230

fundamental domain for a discrete group 116, 
125 

— group π(X), π_1(X), π(X, x_0)
182, 208 

— theorem of projective geometry 85


53, 55, 


136-138, 142,




fundamental theorem of Galois theory 179-182 


——of invariant theory 184


Galileo-Newton group 99-100 

Galois extension 99, 178, 225 

—group 178 

—theory 50, 102, 177, 240 

Gaussian integers 23 

Gauss’ method (row and column operations) 
43, 237 

general linear group GL(n, K),GL(V) 96, 99, 
115, 147, 150, 160, 161, 169-170, 183, 195, 
232 

— — Lie algebra gl(n,K) 190, 195 

generalised Cayley algebra 201 

— quaternion algebra 93-95, 237 

generic equation 181 

generators of algebra or ring 45, 68 

—of a group 101-102, 104

—of a module 36, 42

—and relations 69, 101-102, 108, 122-123,
134-138, 139 

geometric construction of algebraic operations 
13, 50, 85, 242 

geranium 130 

germ of a function 27

graded algebra or ring 46, 63, 68 

group 100 

— algebra or ring Z[G], K[G] 64, 75, 163, 
222, 237 

—character 162, 167 

— cohomology H"(G, A) 222-225 

— defined by relations 101-102, 108, 122- 
123, 134-138, 139 

— generated by reflections 

— GL(n, F_q) 102, 124, 150

— homomorphism 104 

— object in a category 209-210

—of algebraic type 159

— of automorphisms of free module GL(n, A)
99, 236

— of extensions Ext_R(L, M) 103, 219-220

—of finite length 153

— of an integral ternary quadratic form 132 

—of knot 138

—of motions 96, 97, 107, 156

—of order <10 152

— of rotations of 3-space SO(3) 112, 140- 
142, 195 

see also general linear —, knot —, Lie —, 
Lorentz —, orthogonal —, representation of 
—, symmetry —, transformation —, 
unitary — 


119, 122-124, 242 


Hasse-Brauer-Noether theorem (on division 
algebras over Q) 95

Hasse’s theorem (on division algebra over Q_p)
93 

Hasse’s theorem (on division algebra over Q) 
95 

Helmholtz-Lie theorem 168-169, 243 

Hermitian scalar product 80, 168 

Higman’s theorem 136 

Hilbert basis theorem 45 

Hom( ,A) and Hom(A, ) 208, 209 

homeomorphism problem for manifolds 138 

homology H_n(K), H_n(X), H_n(X, A) 102, 208,
213-216 

homomorphic image (= quotient) 36, 42, 45, 
68 

homomorphism of groups 104 

— of rings or algebras 24, 28, 31, 63, 192 

—of modules 36, 74

—of sheaves 225

— of families of vector spaces 40, 230 

homomorphisms theorem 29, 36, 69, 107 

homotopy theory 102, 136, 205, 208-212, 215, 
219, 231-232, 241 

Hopf algebra 166 

icosahedral group Y ≅ A_5 111-112, 123

icosahedron 110-112, 158

ideal class group of a ring Cl A 36, 61, 102-
103, 235

— generated by a system of elements 36,
69 

— left-, right-, two-sided- 68-69, 73, 83-84 

—of a commutative ring 26, 28, 32, 36

—of a Lie algebra 192

identity element 11, 100

— morphism in a category 204

— problem (= word problem) for groups 
135-136 

image of a homomorphism Im f 25, 36, 63,
74, 106, 226 

imaginary part of a quaternion Im(q) 65 

incidence axioms 8-9, 84 

index 249-258 

—of a subgroup (G : H) 106, 109

— of an elliptic operator Ind D 234

—theorem 234, 243 

infinite group 64, 124 

infinite-dimensional representation 80, 176- 
177 

infinitesimal 50-51, 55, 192 

instantaneous angular velocity 195 

integral domain 19

—of a differential form 216
—of motion 98

——over a group I(f) 168

invariant differential form 142, 168, 176 


— Hermitian scalar product 80, 148, 165, 168 


— of a division ring (algebra) μ_p(D) 93-95,
104

——of a group 183, 185

— quadratic form 80, 114, 115 

— Riemannian metric 142, 198 

— subspace 75, 76, 162, 175 

—theory 160, 183-184 

— vector field 142, 193-194 

inverse x^-1 11, 66, 72, 100, 134, 163, 166

—quaternion q^-1 65, 66

invertible element 22 

involution of rings * 67, 71, 191 

irreducible polynomial 14, 22 

— representation (= simple module) 77, 87, 
88, 162, 163, 170, 176 

isomorphism in a category 207

— of fields and rings 13, 17, 24, 29, 63 

—of group actions 105 

—of groups 101

— of ideals and modules 35, 36 

—of Lie algebras 191 

— problem for groups 135-136, 139 

isotopic spin (isospin) 172-174, 185-187 


Jacobi identity 190 

jets 55 

Jordan-Holder theorem 78, 154 

Jordan normal form 43, 80, 103 

Jordan’s theorem on finite subgroups of O(n) 
118 

——on finite subgroups of GL(n, Z) 119,
124, 126 


kernel of homomorphism Ker f 26, 63, 68, 
106 

— of integral operator 38 

— of morphism of sheaves 226 

knot group 138-139 

K-theory 230-239 

— K(A), K(A), K,(A), SK,(A) 235-236 

—K,(Z) 238-239 

— K(X), K(X), K(X) 231 

— K_2 of a field K_2(k) 237


Lagrange’s theorem (on sums of four squares) 
66 

lattice C ⊂ R^n 115, 126, 143

Laurent series 16, 21, 57, 59



left coset, — ideal, — invariant, — regular: see 
coset, ideal, invariant, regular 

Legendre’s theorem (on rational solutions of
ax^2 + by^2 = c) 60, 94, 132

length 77-78, 79, 153-154 

Lie algebra or ring 188-199 

——of a Lie group L(G) 194-197

—group 125, 140, 142, 143-150, 192, 210, 
240 

see also general linear, orthogonal, unitary, 
special linear, spinor and symplectic 
groups 

— subgroup 143 

— theory 192-199, 206, 240 

Lie’s theorem (on the Lie algebra of a Lie 
group) 196 

linear dependent elements of a field 40 

linear differential operator 19, 34 

linear differential operator of order <r 55 

linear map 36

Liouville’s theorem (on integrable systems) 
143, 243 

Lobachevsky plane 105, 131, 149, 169 

local Lie group 196 

long exact cohomology sequence 218-219, 
220-222, 227-228, 231, 236 

———of a subspace 218-219

loop (= closed path) 136, 182, 208

—space ΩX 208, 210

Lorentz group O(3, 1), SO(3, 1) 100, 148-149, 
176 


manifold 40, 47, 52, 55, 56, 125, 131, 142, 148, 
182, 189, 192, 201, 214, 216-217, 225, 228, 
229, 233 

matrix algebra or ring M_n(K), M_n(D) 63, 73,
78, 83, 89, 190 

maximal ideal 29, 32, 40, 51-52 

maximal compact subgroup 148, 158, 224 

meromorphic functions 16, 47, 181, 226, 229 

mesons 185 

minimal polynomial 48 

Minkowski-Hasse theorem (on rational 
solutions of quadratic equations) 60 

modular group PSL(2,Z) 132-133 

module 33-34, 74-79 

— of differential forms 34, 35 

— of finite type (= finitely generated) 42, 44 

— of rank zero 42

— over K[x] corresponding to linear
transformation 34, 42, 43, 77, 80

— over a PID 43, 235

— over Z 35, 100, 205

modulus (= absolute value) of a quaternion |q|
65 

momentum 98 

monodromy of a differential equation 115, 
162 

the Monster 159 

morphism in a category 204

motions 96, 195 

multiplication in modules 36 

— table (= Cayley table) 101, 151 


Noetherian module or ring 44-46 

E. Noether’s theorem (on symmetries and 
conservation laws) 98 

nonassociative (division) algebra 199-200 

noncommutative ring 62 

noncommuting polynomial algebra
K⟨x_1, ..., x_n⟩ 68

nonsingular point of a variety 54 

nonstandard analysis 32 

normal subgroup N ◁ G 106, 109, 194

— — of S_n and A_n 109

normed field 57 

nucleon 185 

number of irreducible representations 88, 163, 165


object of a category Ob(𝒞) 204

octahedral group O ≅ S_4 111-112, 167

octahedron 110-111 

octavions (= Cayley numbers) O 199-200 

odd permutation 109 

opposite (= skew-isomorphic) ring 67, 74, 78, 
82 

orbit of an element Gx 100, 106 

orbit space (= quotient) G\X 100, 105, 106,
125, 126, 169

order of a group |G| 100

— of a group element 107

orthogonal groups O(n), SO(n), PSO(n), O(p, q),
SO(p, q), SO*(p, q) 96, 107, 141, 144, 145,
147, 169, 171, 195

— Lie algebras o(n, K), o(p, q) 190-191, 195

orthogonality of characters 164, 167, 169 

Ostrowski’s theorem (on valuations of Q) 59 


p-adic field Q_p 57, 59, 93, 150, 183, 241
p-group 156 

Pappus’ theorem or axiom 85, 91, 242 
parity law 100 

partially ordered set 84 

path or loop 136, 208

periodicity theorem in K-theory 232 


permutation group 98, 102, 108-109, 119, 
179, 240 

Platonic solids 110-111, 158, 169 

Poincaré-Koebe uniformisation theorem 131 

Poincaré model of Lobachevsky space (upper 
half-plane) 105, 131 

Poisson bracket [ , ] 7, 143, 189 

polyhedron 110, 112, 117, 214

polynomial 18 

— convex hull 25 

— function on a curve 21

— ring A[x], K[x_1, ..., x_n] 17-19, 26, 45, 55,
56-57

Pontryagin duality theorem 171

presheaf 225

prime subfield 30, 48

— ideal 31, 61

primitive element theorem 49

principal ideal 26, 27, 36

— — domain (PID) 26, 43, 235

product in a category 202, 207

— in cohomology 217

— of differential forms 216

— of fields 32

— of ideals 27

— of two modules with values in a third
36-37

projective limit of rings 55

— module 221, 235

— resolution of a module 221, 232

— space P^n 84

— — axioms 84

— — over a division algebra P^n(D) 84

Puiseux expansion 59

purely imaginary quaternion q ∈ H^- 65, 141,
171


quadratic reciprocity 94 

quantum mechanics 8, 172, 174, 185

quark 188 

quasi-algebraically closed field 91 

quaternions H 65, 91, 146, 152, 171,
199

— of modulus 1 SpU(1) 141, 145

quaternionic projective line P^1(H) 66

quotient complex 217

— group G/N 107

— in a composition series 77, 154

— Lie algebra 192

— module 36, 42, 75, 76

— representation 75, 76

— ring (= residue class ring) 28

— sheaf 226


radical extension K(ⁿ√a) 180

rank of an algebra over a field 63, 92 

— of a module rank M 34, 41

rational fraction 12

— function 12

— function field K(x), K(x_1, ..., x_n) 13

— function field of a curve or variety K(C) 
14 

real part of a quaternion Re(q) 65 

reduced suspension SX 210-211, 231, 232 

reflection 97, 98, 119

regular action (left-, right-) 105, 193 

— polyhedron 110-114, 158, 169 

— representation 77, 78, 170, 176-177, 209

representable functors h_A, h^A 208-209

representation of Abelian groups 163-165, 
170, 173 

— of an algebra 75

— of classical complex Lie groups 174-177

— of compact Lie groups 167-174, 185-188

— of finite groups 119, 163-167, 240

— of a group 75-76, 160-177

— of semisimple ring 81-83, 88-89 

— of ©, and octahedral group 167 

— of SO(3) 174 

— of SO(4) 171-172

— of SU(2) 172-174, 185-187

residue class mod I 28

Ricci (trace-free) tensor 172 

Riemann-Roch theorem 229-230, 234, 241 

Riemann surface 31, 58, 126, 131, 137-138, 
226, 229, 241 

right coset, — ideal, — invariant, — regular: 
see coset, ideal, invariant, regular 

rigid body motions 140, 195, 198 

ring 17, 62 

— axioms 62

— of bounded operators on Banach space 62,
69, 87
— of differential operators 63, 69, 73
— of linear transformations: see endomorphism —,
matrix —
(see also commutative-, coordinate-, 
semisimple-) 
rotation 110, 112, 140-142 
row and column operations 43, 237 
ruler-and-compass construction 50 


salt NaCl 98 

Schur’s lemma 77

section of a family of vector spaces 40 
semidirect product of groups 223 
semigroup with unit 206, 230 


semisimple module or ring 79, 81, 90, 103, 
166, 167, 169, 175, 184 

set with operations 4, 11, 17, 100 

sheaf 225-230 

sheaf associated with a divisor on a Riemann
surface F_D 226, 229

sheaf cohomology H^n(X, F) 225-230

short exact sequence 218 

similar lattices 132 

simple algebraic groups 157 

— central algebra 92, 103-104 

— compact Lie groups 157 

— complex Lie groups 157, 177 

— finite groups 159 

— group 154, 155, 158

— Lie algebra 192, 197

— Lie groups 157, 197

— module or ring 72, 73, 77, 81, 82, 84 

simplex 214 

simply connected 137, 142 

singular point of a variety 54 

skew field: see division algebra 

skew-isomorphic (= opposite) ring 67 

skew-isomorphism of rings 67, 71 

smash product X ∧ Y 210

solvable group 155-157, 180 

solving a differential equation by quadratures 
181-182 

— an equation by radicals 180

special linear group SL(n, K) 144, 149, 195 

— — Lie algebra sl(n,K) 190-191, 195 


spinor group Spin(n), Spin(p,q) 146, 149 
sporadic simple groups 159 
stabiliser subgroup of a point G_x 100, 104,
106, 128
Stapelia variegata 130 
Stokes’ theorem 216-217 
structure constants of an algebra 63, 192, 194 
subcomplex 217 
subfield 12, 30 
subgroup 104 
see also Lie—, normal— 
submanifold 32, 200-201
submodule 35 
— generated by a system of elements 36 
subrepresentation (= invariant subspace) 75, 
76, 162, 175 
subring 17, 63 
subsheaf 225 
sum in a category 202, 207
— of extensions of modules 103 
superalgebra (= Z/2-graded algebra) 70, 216 
suspension ΣX 210-211, 232


symbol of an elliptic operator σ_D 234

symmetry 96-99, 158, 161, 169, 174, 177, 242 

— breaking 174, 185-187

symmetry group of a crystal 97-98, 126, 242

— — of the n-cube B_n 123

— — of a lattice (Bravais group) 115-118, 127

— — of a molecule 97, 113

— — of an ornament 97, 128, 242-243

— — of a polynomial 98

— — of physical laws 98, 99, 242

— — of a regular polyhedron 111-112, 158

symmetric group S_n 108-109, 119, 122, 139,
162, 180-181

— power of a module S^r M 39, 184

— function 98, 180

— square of a module S^2 M 39

symplectic group Sp(2n, C) 148


Tamagawa number 151 

tangent space 51-54 

tensor algebra of a vector space T(L) 67-68, 
183 

—, covariant or contravariant 38, 39

— of type (p, q) T^{p,q} 39

— power T^r(M), T^r(φ) 38, 67, 166, 207-208

— product of algebras or rings 92, 104, 166,
204, 207

— — of modules 36-38, 166, 170, 208

— — of representations 166, 167

tetrahedral group T ≅ A_4 97, 111-112, 167

tetrahedron 97, 110-112, 158

topology, topological space 102, 125, 136-140,
213, 225

torsion element or module 42, 43

torus 143, 243

trace 88-89, 90, 173, 195

transcendence degree tr deg L/K 47

transformation group 96, 100, 192, 209, 240

transitive action or transformation group 100,
106, 143, 169

translation in a Lie group 142, 195

trivial family of vector spaces 230

Tsen’s theorem (on division algebras over
K(C)) 91

twistor space 66 
two-sided ideal: see ideal 


ultraproduct of fields 32 

uniformisation of Riemann surfaces 126, 131,
137-138

unique factorisation 22, 61

— — domain (UFD) 22, 26

unitary groups U(n), SU(n), SU(p, q), PSU(n)
144, 158

— Lie algebra u(n), su(n) 191, 196

— representation 169, 177

— symplectic group SpU(n) 145

— — Lie algebra spu(2n, K) 191

— trick 175-176

universal cover 138, 142, 182-183, 197

universal mapping property 37, 202-204, 207

unramified cover of a space 125, 142, 182-183


valuation of a field 57-61 

vector bundle 230 

— field 33, 36, 39, 40, 53, 144, 189, 190, 191, 
193, 226, 233 

— space with a linear transformation 34, 42, 
43, 77, 80, 103 


Wedderburn’s theorem (on finite division 
algebras) 91 

Wedderburn’s theorem (on semisimple rings) 
83, 104 

Wedderburn-Remak-Shmidt theorem 153 

Weierstrass approximation theorem 170 

Weierstrass preparation theorem 22 

Weyl tensor 172 

word 134 

word problem (= identity problem) for groups 
135-136 


Z/2-grading 70 
Z-module (= Abelian group) 100, 205 


zeta-function 238 


Ω⁻-hyperon 188


